Microeukaryotic Protein Database (VDB_Microeukaryotic_v1)

Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://figshare.com/articles/dataset/Microeukaryotic_Protein_Database/19668855
Description:
Version: VDB_Microeukaryotic_v1

Contains 4 files:

-rw-r--r--  1 jespinoz  staff    10G Apr 18 19:46 reference.rmdup.iupac.relabeled.no_deprecated.complete_lineage.faa.gz
-rw-r--r--  1 jespinoz  staff   167M Apr 18 19:40 target_to_source.dict.pkl.gz
-rw-r--r--  1 jespinoz  staff   605K Apr 18 19:40 source_to_lineage.dict.pkl.gz
-rw-r--r--  1 jespinoz  staff   542K Apr 18 19:42 source_taxonomy.tsv.gz

* The main FASTA protein file is the dereplicated combination of NR (protists and fungi only), MMETSP, EukZoo, and EukProt. Only complete lineages are included, since the database is partially used for classification.
* The .pkl.gz files are gzipped, pickled Python dictionaries.
* target_to_source.dict.pkl.gz maps the identifiers in the FASTA file to their original source.
* source_to_lineage.dict.pkl.gz maps source identifiers to lineage strings (e.g., c__Aconoidasida;o__Haemosporida;f__Haemoproteidae;g__Haemoproteus;s__Haemoproteus sp. hCWT4).
* source_taxonomy.tsv.gz has the taxonomy for each source identifier.

Citation:

* Espinoza, J.L., Dupont, C.L. VEBA: a modular end-to-end suite for in silico recovery, clustering, and analysis of prokaryotic, microeukaryotic, and viral genomes from metagenomes. BMC Bioinformatics 23, 419 (2022). https://doi.org/10.1186/s12859-022-04973-8
* Espinoza, Josh (2022): Microeukaryotic Protein Database. figshare. Dataset. https://doi.org/10.6084/m9.figshare.19668855.v1
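
The two .pkl.gz dictionaries described above can be chained to go from a FASTA identifier to a lineage string. The following is a minimal sketch of that lookup, not part of the dataset or of VEBA itself. It assumes both dictionaries hold plain string keys and values, and the helper names (`load_pickled_dict`, `lineage_for_target`, `parse_lineage`) are hypothetical, introduced only for illustration. Note that unpickling can execute arbitrary code, so only load files obtained from the official figshare record.

```python
import gzip
import pickle


def load_pickled_dict(path):
    """Load a gzip-compressed pickled dictionary (hypothetical helper)."""
    with gzip.open(path, "rb") as handle:
        return pickle.load(handle)


def lineage_for_target(target_id, target_to_source, source_to_lineage):
    """Resolve a FASTA identifier to its lineage string via its source.

    Assumes both mappings are str -> str. Returns None when either
    lookup has no entry for the given identifier.
    """
    source_id = target_to_source.get(target_id)
    if source_id is None:
        return None
    return source_to_lineage.get(source_id)


def parse_lineage(lineage):
    """Split a lineage such as 'c__X;o__Y' into {'c': 'X', 'o': 'Y'}."""
    ranks = {}
    for field in lineage.split(";"):
        prefix, separator, name = field.partition("__")
        if separator:  # skip fields that lack the 'rank__name' form
            ranks[prefix] = name
    return ranks
```

In use, one would call `load_pickled_dict` once for target_to_source.dict.pkl.gz and once for source_to_lineage.dict.pkl.gz, then pass both dictionaries to `lineage_for_target`. The larger file is 167M compressed, so expect it to take noticeably more memory once loaded.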
Created:
2022-07-07