Proteome database of 36 million proteins from 4,351 species, including marine microbial sequences
Source: NIAID Data Ecosystem. Indexed 2026-03-14.
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.4tmpg4ffn
Description:
A FASTA-formatted database of 36,866,870 predicted proteins representing 4,351 unique species from 117 phyla.
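Because the database is a single FASTA file with tens of millions of records, it is usually safer to stream it than to load it into memory. The following is a minimal sketch of a streaming reader and record counter, assuming standard FASTA conventions (header lines start with ">", sequences may wrap across lines). The function names `read_fasta` and `count_proteins` are illustrative and not part of the dataset.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines, one record at a time."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines are joined so wrapped sequences become one string.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_proteins(lines: Iterable[str]) -> int:
    """Count FASTA records without holding the sequences in memory."""
    return sum(1 for _ in read_fasta(lines))
```

Since an open text file is itself an iterable of lines, passing a file handle to `count_proteins` gives a record count that can be compared against the stated total of 36,866,870.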
Methods
A database of 36,866,870 predicted proteins representing 4,351 unique species from 117 phyla (see table below) was constructed using the UniProt Reference Proteome (RP) set at the 35% co-membership threshold, including 4,295 Representative Proteome Groups (RPGs) (Chen et al. 2011), in addition to all taxonomically identifiable transcriptomes of the Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP) (Keeling et al. 2014) that were processed through WinstonCleaner (https://github.com/kolecko007/WinstonCleaner). The database also included the following: proteins inferred from the annotated and assembled genomes of Aurantiochytrium limacinum ATCC MYA-1381, Schizochytrium aggregatum ATCC 28209, and Aplanochytrium kerguelensis PBS07 from the U.S. Department of Energy’s Joint Genome Institute (JGI); all Pfam PF00494 hits in the Aurantiochytrium sp. KH105 proteome from the Okinawa Institute of Science and Technology Marine Genomics Unit genome browser; all of UniProt's annotated Hondaea fermentalgiana proteins; and the annotated proteins of the breviate Lenisia limosa and its associated mutualistic epibionts (Hamann et al. 2016).
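The record does not describe how the individual sources were concatenated into one file. As an illustration only, the sketch below shows one way several FASTA sources could be merged while tagging each header with its origin and skipping repeated identifiers within a source. The function name `merge_fasta`, the source labels, and the `label|identifier` header scheme are all assumptions, not the authors' procedure.

```python
from typing import Dict, Iterable, Iterator


def merge_fasta(sources: Dict[str, Iterable[str]]) -> Iterator[str]:
    """Yield FASTA lines from several sources, prefixing each header with its source label.

    Hypothetical scheme: a record is identified by "<label>|<first header token>",
    and a record whose identifier was already seen is skipped.
    """
    seen = set()
    for label, lines in sources.items():
        keep = False  # whether sequence lines of the current record are emitted
        for line in lines:
            line = line.rstrip("\n")
            if not line:
                continue
            if line.startswith(">"):
                parts = line[1:].split()
                record_id = parts[0] if parts else ""
                tagged_id = f"{label}|{record_id}"
                keep = tagged_id not in seen
                if keep:
                    seen.add(tagged_id)
                    yield f">{label}|{line[1:]}"
            elif keep:
                yield line
```

Prefixing headers with a source label keeps identifiers from different sources distinct, which makes it possible to trace a hit in the combined database back to the collection it came from.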
Unique phyla represented in the reference proteome database.
Archaea: Candidatus, Caudovirales, Crenarchaeota, Euryarchaeota, Hyperthermophilic, Nanoarchaeota, Thaumarchaeota

Bacteria: Abditibacteriota, Acidobacteria, Actinobacteria, Aquificae, Armatimonadetes, bacterium, Bacteroidetes, Balneolaeota, Calditrichaeota, candidate, Candidatus, Chlamydiae, Chlorobi, Chloroflexi, Chrysiogenetes, Coprothermobacterota, Cyanobacteria, Deferribacteres, Deinococcus-Thermus, Dictyoglomi, Elusimicrobia, Fibrobacteres, Firmicutes, Fusobacteria, Gemmatimonadetes, Haloplasmatales, Ignavibacteriae, Kiritimatiellaeota, Lentisphaerae, Natronospirillum, Nitrospinae, Nitrospirae, Planctomycetes, Proteobacteria, Rhodothermaeota, Spirochaetes, Synergistetes, Tenericutes, Thermobaculum, Thermodesulfobacteria, Thermotogae, Vampirococcus, Verrucomicrobia

Eukaryota: Annelida, Apicomplexa, Apusomonadidae, Arthropoda, Ascomycota, Bacillariophyta, Basidiomycota, Bigyra, Blastocladiomycota, Bolidophyceae, Brachiopoda, Breviatea, Cercozoa, Chlorophyta, Choanoflagellata, Chordata, Chromeraceae, Chromerida, Chrysophyceae, Chytridiomycota, Ciliophora, Cnidaria, Cryptomycota, Cryptophyta, Dictyochophyceae, Dinophyceae, Discosea, Echinodermata, Endomyxa, Euglenozoa, Evosea, Filasterea, Foraminifera, Fornicata, Glaucocystophyceae, Haptista, Heterolobosea, Ichthyosporea, Microsporidia, Mollusca, Mucoromycota, Nematoda, Oomycetes, Palpitomonas, Parabasalia, Pelagophyceae, Perkinsozoa, Phaeophyceae, Pinguiophyceae, Placozoa, Platyhelminthes, Porifera, Raphidophyceae, Rhodophyta, Rotifera, Rotosphaerida, Stereomyxa, Streptophyta, Synchromophyceae, Synurophyceae, Tardigrada, Tubulinea, Vitrellaceae, Xanthophyceae, Zoopagomycota
References
Chen C, Natale DA, Finn RD, Huang H, Zhang J, Wu CH, Mazumder R. 2011. Representative Proteomes: A Stable, Scalable and Unbiased Proteome Set for Sequence Analysis and Functional Annotation. PLOS ONE. 6(4):e18910.
Keeling PJ, Burki F, Wilcox HM, Allam B, Allen EE, Amaral-Zettler LA, Armbrust EV, Archibald JM, Bharti AK, Bell CJ, et al. 2014. The Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP): illuminating the functional diversity of eukaryotic life in the oceans through transcriptome sequencing. PLoS Biol. 12(6):e1001889.
Hamann E, Gruber-Vodicka H, Kleiner M, Tegetmeyer HE, Riedel D, Littmann S, Chen J, Milucka J, Viehweger B, Becker KW, et al. 2016. Environmental Breviatea harbour mutualistic Arcobacter epibionts. Nature. 534:254–258.
Created: 2023-02-27



