
Microeukaryotic Protein Database

Source: Figshare. Updated: 2022-12-21. Indexed: 2026-04-08.
Download link:
https://figshare.com/articles/dataset/Microeukaryotic_Protein_Database/19668855/1
Description:
Version: VDB_Microeukaryotic_v1

Contains 4 files:

-rw-r--r-- 1 jespinoz staff 10G Apr 18 19:46 reference.rmdup.iupac.relabeled.no_deprecated.complete_lineage.faa.gz
-rw-r--r-- 1 jespinoz staff 167M Apr 18 19:40 target_to_source.dict.pkl.gz
-rw-r--r-- 1 jespinoz staff 605K Apr 18 19:40 source_to_lineage.dict.pkl.gz
-rw-r--r-- 1 jespinoz staff 542K Apr 18 19:42 source_taxonomy.tsv.gz

* The main FASTA protein file is the dereplicated combination of NR (protists and fungi only), MMETSP, EukZoo, and EukProt. Only complete lineages are included, since the database is partially used for classification.
* The .pkl.gz files are gzipped, pickled Python dictionaries.
* target_to_source.dict.pkl.gz maps the identifiers in the FASTA file to their original source.
* source_to_lineage.dict.pkl.gz maps source identifiers to lineage strings (e.g., c__Aconoidasida;o__Haemosporida;f__Haemoproteidae;g__Haemoproteus;s__Haemoproteus sp. hCWT4).
* source_taxonomy.tsv.gz has the taxonomy for each source identifier.

Citation:

* Espinoza, J.L., Dupont, C.L. VEBA: a modular end-to-end suite for in silico recovery, clustering, and analysis of prokaryotic, microeukaryotic, and viral genomes from metagenomes. BMC Bioinformatics 23, 419 (2022). https://doi.org/10.1186/s12859-022-04973-8
* Espinoza, Josh (2022): Microeukaryotic Protein Database. figshare. Dataset. https://doi.org/10.6084/m9.figshare.19668855.v1
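The two pickled dictionaries can be chained to go from a FASTA identifier to a lineage string. The following is a minimal sketch, not code from the dataset authors: the helper names (`load_pickled_dict`, `lineage_for_target`, `parse_lineage`) are hypothetical, and it assumes that both dictionaries map plain strings to plain strings and that lineage strings use the `rank__name;rank__name` layout shown in the example above. Because unpickling can execute arbitrary code, load these files only from a source you trust.

```python
import gzip
import pickle


def load_pickled_dict(path):
    """Load a gzipped, pickled object (assumed here to be a dict) from disk."""
    with gzip.open(path, "rb") as handle:
        return pickle.load(handle)


def lineage_for_target(target_id, target_to_source, source_to_lineage):
    """Chain the two mappings: FASTA identifier -> source -> lineage string.

    Returns None when either lookup fails.
    """
    source_id = target_to_source.get(target_id)
    if source_id is None:
        return None
    return source_to_lineage.get(source_id)


def parse_lineage(lineage):
    """Split 'c__X;o__Y' into {'c': 'X', 'o': 'Y'} (assumed format)."""
    ranks = {}
    for field in lineage.split(";"):
        prefix, sep, name = field.partition("__")
        if sep:  # skip fields that lack the 'rank__' prefix
            ranks[prefix] = name
    return ranks
```

With the files downloaded, usage would look like `target_to_source = load_pickled_dict("target_to_source.dict.pkl.gz")` and `source_to_lineage = load_pickled_dict("source_to_lineage.dict.pkl.gz")`, followed by `lineage_for_target(some_id, target_to_source, source_to_lineage)`, where `some_id` is a header identifier taken from the FASTA file.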
Provider:
Espinoza, Josh
Created:
2022-07-07