This page has links to research data that can be used for your Individual Project in the field of:
You can find datasets from many disciplines, including environmental and social sciences, as well as government data and data provided by news organizations, in:
Our World in Data: Datasets for each graph can be downloaded
World Bank: Various databases per country, including cause of death, agriculture, economics, education, climate change, aid, health and nutrition
Data.gov: Open federal, state and local data from the United States government.
MorphoBank: Provides collaborative tools for researchers to upload images and morphological data, and use that information to produce, edit, illustrate and annotate phylogenetic matrices. Also a repository for data associated with peer-reviewed publications
Many zoos and wildlife areas offer live webcams from which ethology data can be collected
Catalog of Life: Single integrated species checklist and taxonomic hierarchy - holds essential information on the names, relationships and distributions of over 1.6 million species
Data Basin: Provides free access to biological, physical and socioeconomic geospatial data and maps, along with tools to create custom visualizations, drawings and analyses
Global Biodiversity Information Facility (GBIF): Facilitates free and open access to biodiversity data, enabling anyone to discover, use or publish data about all types of life on Earth
Integrated Taxonomic Information System (ITIS): Authoritative taxonomic information on plants, animals, fungi and microbes of North America and the world. Full database or specific taxonomic group data available for download
Knowledge Network for Biocomplexity (KNB): International repository for ecological and environmental data. Data originate from field stations, laboratories, research sites and individual researchers around the world
The Long Term Ecological Research Network (LTER): A collaborative of researchers and graduate students who focus on long-term ecological processes at 26 LTER sites around the United States, Antarctica, and islands in the Caribbean and Pacific. Contains ecological data packages contributed by past and present LTER sites
EDDMaps (invasive species)
iMapInvasives (invasive species)
ReefBase: Coral reef data
Multivariate Analyses of Small Theropod Dinosaur Teeth and Implications for Paleoecological Turnover through Time Andrew Farke: “This paper includes a massive dataset of measurements for over 1,000 teeth of small carnivorous dinosaur”
Avian Knowledge Network (AKN): Partnership of people, institutions and government agencies that supports the conservation of birds and their habitats by improving access to and use of data and tools. Data available on bird-monitoring, banding and citizen-based bird-surveillance.
1000 Genomes: The genomes of more than a thousand anonymous participants from a number of different ethnic groups were analyzed and made publicly available.
BioServers: Easy to use interface for DNA database searches
DNA Data Bank of Japan (DDBJ)
Online Mendelian Inheritance in Man (OMIM): inherited diseases
EggNOG Database: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. It provides multiple sequence alignments and maximum-likelihood trees, as well as broad functional annotations
Ensembl: provides automatic annotation databases for human, mouse, other vertebrate and eukaryotic genomes
FlyBase: genome of the model organism Drosophila melanogaster
Personal Genome Project: human genomes of 100,000 volunteers from around the world
USDA nutrient database: Contains a complete nutrition profile for various food and drink items
PHI-base: pathogen-host interaction database. It links gene information to phenotypic information from microbial pathogens on their hosts
Human NIH Data Sharing Repositories: List of NIH-supported data repositories and resources that aggregate information about biomedical data. Each entry has a brief description of the repository and links to data submission and access policies
Human Protein Atlas (HPA): Expression profiles of human protein coding genes both on mRNA and protein level in tissues, cells, subcellular compartments, and cancer tumors
Database of Interacting Proteins (Univ. of California)
InterPro: Classifies proteins into families and predicts the presence of domains and sites
MobiDB: Database of intrinsic protein disorder annotation
Pfam: Protein families database of alignments
PROSITE: Database of protein families and domains
National Center for Biotechnology Information (NCBI): Protein sequence and knowledge base
Universal Protein Resource (UniProt): A collaboration between the European Bioinformatics Institute, the SIB Swiss Institute of Bioinformatics and the Protein Information Resource that provides high-quality, freely accessible protein sequence and functional information
You can find DNA sequences, amino acid sequences, SNPs (single nucleotide polymorphisms), genes, and other related databases in the links below. Most are from the National Center for Biotechnology Information, part of the U.S. National Library of Medicine.
The Cornell Lab of Ornithology is a leader in the study, appreciation, and conservation of birds. Through their programs they aim to advance the understanding of nature and to engage people of all ages in learning about birds and protecting the planet. They host the eBird database, in collaboration with organizations, regional experts, and users ("eBirders") all over the world.
eBird is the world’s largest biodiversity-related citizen science project, with more than 100 million bird sightings contributed each year by eBirders around the world. eBird data document bird distribution, abundance, habitat use, and trends through checklist data collected within a simple, scientific framework. Birders enter when, where, and how they went birding, and then fill out a checklist of all the birds seen and heard during the outing. Access this database by creating an account with a username and password. eBird includes population data from The Great Backyard Bird Count, maps of citizen-created bird habitat from Habitat Network, bird songs and calls from Macaulay Library, nest camera data from NestWatch, and sightings at bird feeders from Project FeederWatch. These citizen science projects at the Cornell Lab of Ornithology provide a way for people to learn about birds, habitat, science, and conservation while contributing to real scientific studies.
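If you plan to analyze checklist-style data for your project, the following Python sketch may help you think about how such records can be summarized. It is only an illustration: the `Checklist` class, its field names, and the sample sightings are all hypothetical and do not reflect eBird's actual download format, so adapt them to the columns in the file you obtain.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Checklist:
    """One birding outing: when, where, how, and what was counted.

    All field names here are hypothetical, chosen for illustration only.
    """
    date: str       # when the outing took place, e.g. "2024-05-01"
    location: str   # where the outing took place
    protocol: str   # how the birder went birding, e.g. "stationary"
    counts: dict = field(default_factory=dict)  # species name -> number seen or heard


def total_counts(checklists):
    """Add up the number of individuals reported per species."""
    totals = Counter()
    for checklist in checklists:
        totals.update(checklist.counts)
    return totals


def reporting_frequency(checklists, species):
    """Fraction of checklists on which the species was reported at least once."""
    if not checklists:
        return 0.0
    hits = sum(1 for c in checklists if c.counts.get(species, 0) > 0)
    return hits / len(checklists)


# Made-up sample data, not real observations.
sample_checklists = [
    Checklist("2024-05-01", "City Park", "stationary",
              {"American Robin": 3, "Blue Jay": 1}),
    Checklist("2024-05-02", "City Park", "traveling",
              {"American Robin": 2}),
    Checklist("2024-05-03", "River Trail", "traveling",
              {"Blue Jay": 4}),
]
```

Calling `total_counts(sample_checklists)` gives a per-species tally, and `reporting_frequency(sample_checklists, "American Robin")` gives the share of outings on which that species appeared. Reporting frequency is often a more robust measure than raw counts when outings differ in length or effort.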
Another resource available on the Cornell Lab of Ornithology website:
Google's Dataset Search platform enables users to find datasets stored across the Web through a simple keyword search. The tool surfaces information about datasets hosted in thousands of repositories across the Web, making these datasets universally accessible and useful.
Google believes that this project will have the additional benefits of a) creating a data sharing ecosystem that will encourage data publishers to follow best practices for data storage and publication and b) giving scientists a way to show the impact of their work through citation of datasets that they have produced.
As more dataset repositories use schema.org and similar standards to describe their datasets, the variety and coverage of datasets that users find in Dataset Search will continue to grow.
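To give a sense of what a schema.org dataset description looks like, here is a minimal Python sketch that builds one as JSON-LD. The property names (`name`, `description`, `url`, `license`, `keywords`) come from the schema.org `Dataset` type; the function name, the example dataset, and the example.org address are hypothetical placeholders, and a real repository would typically include more detail.

```python
import json


def dataset_jsonld(name, description, url, license_url, keywords):
    """Build a minimal schema.org Dataset description as a Python dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "license": license_url,
        "keywords": list(keywords),
    }


# Hypothetical example dataset; none of these values refer to a real record.
example = dataset_jsonld(
    name="Pond invertebrate survey",
    description="Counts of invertebrates sampled from a school pond.",
    url="https://example.org/datasets/pond-survey",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    keywords=["ecology", "invertebrates", "freshwater"],
)

# JSON-LD text that a repository could embed in the dataset's web page.
markup = json.dumps(example, indent=2)
```

A repository would usually place the resulting text inside a `<script type="application/ld+json">` element on the dataset's landing page so that search tools can read it.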
Datasets that can be accessed on this page include: