Catalyzing Science with Open Data — Crowdsourcing and Citizen Science

New Frontiers with Data Exploration

Data are a valuable national asset, as highlighted in the first part of this Open Science series. Big, open, interoperable data and technology are transforming society, science, and medicine by unlocking new answers to complex questions that were previously unsolvable. Data also underlie science, guiding evidence-based endeavors and forming the core of the scientific method.

Equally important are democratizing forces arising from open data and the big data revolution. By allowing equal access to scientific information, today’s science engages new, diverse communities beyond the ivory tower for novel insights and scientific discovery. Today’s playing field is more level than it has ever been. Data and open data access potentially turn all global citizens into scientists, for example, through crowdsourcing and citizen science.

Crowdsourcing and Citizen Science for Societal Benefits

Definitions vary but, generally:

Crowdsourcing is “the practice of obtaining needed services, ideas, or content by soliciting contributions from a large group of people and especially from the online community rather than from traditional employees or suppliers.”

Citizen science is “scientific work undertaken by members of the general public, often in collaboration with or under the direction of professional scientists and scientific institutions.”

(For more definitions, see my previous article’s glossary here.)

The data revolution empowers the human dimensions of science and society. Crowdsourcing and citizen science have both grown in visibility and importance with the big data revolution and its priority on open data and the innovation ecosystem, which connects all of us through the Internet and 21st-century technologies. New technology, big-data analysis, and interconnected networks create unique opportunities to think differently and find solutions never before imagined. Research and medicine, for example, are transforming from expert-centered to patient-centered, as evidenced by recent initiatives like PCORI and the Precision Medicine Initiative, with Blue Button data empowering every American citizen with control over their own health information. This scope of research increasingly engages everyday citizens and patient organizations, which may hold the keys to the kingdom in big data: patient engagement and consent.

Once-baffling research challenges are today being solved by heterogeneous, arguably eclectic, crowdsourced efforts. We are engaging new minds in the general public, such as online gaming communities, to help advance scientific discovery.

As with anything new and unconventional, crowdsourcing and citizen science are not without controversy. Some professional researchers question the validity of data collected by amateurs. Evaluation of citizen science programs, however, shows that well-designed programs with proper training (if necessary) yield high-quality, robust information.

A second potential challenge to catalyzing citizen science with open data is legal barriers. For example, in 2015, Wyoming lawmakers passed the Trespassing to Collect Data Bill, creating a new state law that bans data collection on private land without permission from the landowner. The law is sweeping and ambiguous: it applies to all “open land,” however Wyoming courts will interpret that term, with no exceptions for university or government researchers. U.S. courts have not yet written the final chapter or settled specific interpretations, yet it is clear that crowdsourcing, citizen science, and open data are here to stay. Perhaps we are in the midst of another collective paradigm shift in scientific thinking, where “the first one through the wall always gets bloody,” to paraphrase Hollywood (Moneyball), while the Western Watersheds Project fights in the courts for citizen science.

There are many great examples of crowdsourced solutions and citizen discoveries making breakthroughs across multiple communities, media, and disciplinary and interdisciplinary applications. The most comprehensive repository is SciStarter, which lists 1,000 active, searchable citizen science projects. Out of the limitless possibilities, here are some examples:

  • ChemSpider is an open-source public database and free resource for chemists with over 20 million compounds and 34 million structures from over 300 different data sources.
  • The FoldIt video game engages the online gaming community to solve for protein shapes, providing novel insights critical to understanding HIV, human aging and disease, biofuels, pharmaceuticals, chemicals, and more.
  • The Great Sunflower Project asks people to collect data on pollinators in their yards, gardens, schools, and parks. The huge influx of data allows for better understanding of where — and possibly why — pollinators are at risk.
  • Jellywatch.org smartphone app allows people walking on beaches to report jellyfish and velella sightings, which help researchers to better understand these marine species and our oceans.
  • Nanocrafter teaches players how to build devices with DNA through a game-style interface. The game, from the same researchers who developed FoldIt, seeks to expand the volume of people working in the field by orders of magnitude.
  • Reefcheck.org harnesses the power of thousands of trained volunteer SCUBA divers, sometimes funded through crowdsourced campaigns, to survey fish and reefs. Scientists and governments use Reef Check information to help monitor the health of reefs, marine reserves, and ocean ecosystems.
  • SETI@home is a scientific experiment that uses Internet-connected computers in the Search for Extraterrestrial Intelligence (SETI). Citizens can participate by running a free program that downloads and analyzes radio telescope data in search of ET.

These projects combine a high level of enthusiasm and interest in science with an arguably low barrier to entry compared to the conventional time and knowledge requirements for M.S. and Ph.D. degrees in science. Anyone walking on a beach can photograph a jellyfish or other ecosystem indicators, with no scientific training required. Anyone who owns a computer can download free SETI applications to run at night, with no time commitment at all, since the program works while you sleep.

I believe Albert Einstein epitomizes the power of crowdsourcing and citizen science: “We cannot solve our problems with the same thinking we used when we created them.” Keep an open mind. Explore. Contribute. We are all scientists and have ideas to offer, each in our own unique way. With today’s growing momentum of open data, crowdsourcing, and citizen science, where will discoveries come from next?


This is the second in a new Capital Chemist series of “Catalyzing Science with Open Data” posts. Stay tuned for future posts on how open data matters to you!

Part 1: Catalyzing Science with Open Data – What is Open Data?

Please tweet @khoney with hashtag #OpenScience to suggest future topics, projects, and success stories.

The #OpenScience blog series is cross-posted on AAAS Sci on the Fly, the blog of the American Association for the Advancement of Science (AAAS) Fellowship Program.


Photo credit: tudor, CC BY-NC-ND 2.0