Awesome Public Datasets
Overview of the curated dataset collection
Agriculture
-
U.S. Department of Agriculture's Nutrient Database
- URL: https://www.ars.usda.gov/northeast-area/beltsville-md/beltsville-human-nutrition-research-center/nutrient-data-laboratory/docs/sr28-download-files/
- Description: Nutrient data from the USDA.
-
U.S. Department of Agriculture's PLANTS Database
- URL: http://www.plants.usda.gov/dl_all.html
- Description: Information about plants in the U.S.
Biology
-
1000 Genomes
- URL: http://www.1000genomes.org/data
- Description: Genomic data from the 1000 Genomes Project.
-
American Gut (Microbiome Project)
- URL: https://github.com/biocore/American-Gut
- Description: Microbiome data from the American Gut Project.
-
Broad Bioimage Benchmark Collection (BBBC)
- URL: https://www.broadinstitute.org/bbbc
- Description: Bioimage data from the Broad Institute.
-
Broad Cancer Cell Line Encyclopedia (CCLE)
- URL: http://www.broadinstitute.org/ccle/home
- Description: Cancer cell line data from the Broad Institute.
-
Cell Image Library
- URL: http://www.cellimagelibrary.org
- Description: Public repository of cellular images.
-
Complete Genomics Public Data
- URL: http://www.completegenomics.com/public-data/69-genomes/
- Description: Public genomic data from Complete Genomics.
-
EBI ArrayExpress
- URL: http://www.ebi.ac.uk/arrayexpress/
- Description: Gene expression and molecular abundance data.
-
EBI Protein Data Bank in Europe
- URL: http://www.ebi.ac.uk/pdbe/emdb/index.html/
- Description: Protein structure data from the Protein Data Bank in Europe (PDBe).
-
ENCODE project
- URL: https://www.encodeproject.org
- Description: Encyclopedia of DNA elements.
-
Electron Microscopy Pilot Image Archive (EMPIAR)
- URL: http://www.ebi.ac.uk/pdbe/emdb/empiar/
- Description: Archive for raw EM data.
-
Ensembl Genomes
- URL: http://ensemblgenomes.org/info/genomes
- Description: Genomic data for non-vertebrate species.
-
Gene Expression Omnibus (GEO)
- URL: http://www.ncbi.nlm.nih.gov/geo/
- Description: Gene expression data repository.
-
Gene Ontology (GO)
- URL: http://geneontology.org/page/download-annotations
- Description: Ontology for gene functions.
-
Global Biotic Interactions (GloBI)
- URL: https://github.com/jhpoelen/eol-globi-data/wiki#accessing-species-interaction-data
- Description: Species interaction data.
-
Harvard Medical School (HMS) LINCS Project
- URL: http://lincs.hms.harvard.edu
- Description: Library of Network-Based Cellular Signatures.
-
Human Genome Diversity Project
- URL: http://www.hagsc.org/hgdp/files.html
- Description: Genetic diversity data.
-
Human Microbiome Project (HMP)
- URL: http://www.hmpdacc.org/reference_genomes/reference_genomes.php
- Description: Human microbiome data.
-
ICOS PSP Benchmark
- URL: http://ico2s.org/datasets/psp_benchmark.html
- Description: Benchmark datasets for protein structure prediction (PSP).
-
International HapMap Project
- URL: http://hapmap.ncbi.nlm.nih.gov/downloads/index.html.en
- Description: Haplotype map of the human genome.
-
Journal of Cell Biology DataViewer
- URL: http://jcb-dataviewer.rupress.org
- Description: Data viewer for cell biology.
-
KEGG
- URL: http://www.genome.jp/kegg/
- Description: Database resource for understanding high-level functions and utilities of the biological system.
-
MIT Cancer Genomics Data
- URL: http://www.broadinstitute.org/cgi-bin/cancer/datasets.cgi
- Description: Cancer genomics data from MIT.
-
NCBI Proteins
- URL: http://www.ncbi.nlm.nih.gov/guide/proteins/#databases
- Description: Protein data from NCBI.
-
NCBI Taxonomy
- URL: http://www.ncbi.nlm.nih.gov/taxonomy
- Description: Taxonomic data from NCBI.
-
NCI Genomic Data Commons
- URL: https://gdc-portal.nci.nih.gov
- Description: Genomic data from the National Cancer Institute.
-
NIH Microarray data
- URL: http://bit.do/VVW6
- Description: Microarray data from NIH.
-
OpenSNP genotypes data
- URL: https://opensnp.org/
- Description: Public genotype data.
-
Pathguide - Protein-Protein Interactions Catalog
- URL: http://www.pathguide.org/
- Description: Catalog of protein-protein interactions.
-
Protein Data Bank
- URL: http://www.rcsb.org/
- Description: Repository of 3D structural data of large biological molecules.
-
Psychiatric Genomics Consortium
- URL: https://www.med.unc.edu/pgc/downloads
- Description: Genomic data related to psychiatric disorders.
-
PubChem Project
- URL: https://pubchem.ncbi.nlm.nih.gov/
- Description: Chemical compound and drug data.
-
PubGene (now Coremine Medical)
- URL: http://www.pubgene.org/
- Description: Gene-related data.
-
Sanger Catalogue of Somatic Mutations in Cancer (COSMIC)
- URL: http://cancer.sanger.ac.uk/cosmic
- Description: Catalog of somatic mutations in cancer.
-
Sanger Genomics of Drug Sensitivity in Cancer Project (GDSC)
- URL: http://www.cancerrxgene.org/
- Description: Genomic data on drug sensitivity in cancer.
-
Sequence Read Archive (SRA)
- URL: http://www.ncbi.nlm.nih.gov/Traces/sra/
- Description: Repository for high-throughput sequencing data.
-
Stanford Microarray Data
- URL: http://smd.stanford.edu/
- Description: Microarray data from Stanford.
-
Stowers Institute Original Data Repository
- URL: http://www.stowers.org/research/publications/odr
- Description: Original data from the Stowers Institute.
-
Systems Science of Biological Dynamics (SSBD) Database
- URL: http://ssbd.qbic.riken.jp
- Description: Database for biological dynamics.
-
The Cancer Genome Atlas (TCGA), available via Broad GDAC
- URL: https://gdac.broadinstitute.org/
- Description: Comprehensive and coordinated efforts to decipher the molecular




