
microbetag: building a thorough database of genome-scale KO annotations

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/6345382
Resource description:
In this repository we keep internal data for the microbetag microbial co-occurrence network annotator. microbetag uses a two-column file for each genome, listing each KO term found and a KEGG module in which that term takes part. Because a single KO term may participate in more than one KEGG module, the same KO may appear more than once in an annotation file.

gtdb_modelseed_gems.zip: for all the GTDB genomes, their corresponding PATRIC annotations were gathered. Then, using modelseedpy, we constructed their genome-scale metabolic reconstructions (GEMs).

gtdb_kofam_scan_per_module.tar.gz: all representative genomes of GTDB (v.202) were parsed and their corresponding `.faa` files were retrieved from the NCBI FTP. The kofam_scan tool was then used to annotate them, and finally a custom script was used to keep the KOs of each genome per module.

updated_seedsets_of_interest.pckl: a pickle file with the seeds of each GEM included in the gtdb_modelseed_gems.zip file, related to the KEGG modules based on the seedId_keggId_module.tsv file available on microbetag's GitHub page.

Example:
PATRIC      SeedSet
373.172     [cpd00891, cpd00136, cpd00199, cpd01772, cpd00...
397278.5    [cpd00891, cpd00136, cpd01772, cpd02698, cpd08...

updated_non_seedsets_of_interest.pckl: a pickle file with the non-seeds of each GEM included in the gtdb_modelseed_gems.zip file, related to the KEGG modules based on the seedId_keggId_module.tsv file available on microbetag's GitHub page.

Example:
PATRIC        NonSeedSet
64187.548     [cpd00508, cpd00869, cpd00774, cpd03830, cpd00...
74426.1719    [cpd00204, cpd00447, cpd20171, cpd03470, cpd00...

phen_classes.zip: a set of pickle files with the re-trained classes of phenDB for the prediction of functional traits of a genome.
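The two-column annotation files described above could be read along the following lines. This is only a minimal sketch: the tab delimiter, the column order (KO first, module second) and the function name `parse_ko_module_pairs` are assumptions made for illustration, and the sample rows are illustrative rather than taken from the dataset. Check an actual file from gtdb_kofam_scan_per_module.tar.gz before relying on it.

```python
import csv
import io
from collections import defaultdict


def parse_ko_module_pairs(handle, delimiter="\t"):
    """Group KO terms by KEGG module from a two-column annotation file.

    Assumes (not confirmed by the dataset description) that column 1 holds
    the KO term and column 2 the KEGG module it takes part in.
    """
    module_to_kos = defaultdict(set)
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        ko, module = row[0].strip(), row[1].strip()
        # A set absorbs repeated KO/module pairs without duplication.
        module_to_kos[module].add(ko)
    return dict(module_to_kos)


# Illustrative rows: the same KO appears twice because it is listed
# under two different modules.
example = io.StringIO("K00844\tM00001\nK00844\tM00549\nK01810\tM00001\n")
modules = parse_ko_module_pairs(example)
print(sorted(modules["M00001"]))
```

Grouping by module rather than by KO reflects the point made in the description: a KO may repeat across lines, but each line pairs it with a different module.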
Created:
2025-03-28