agnostosDB_dbf02445-20200519
Source: DataCite Commons, updated 2020-08-25, indexed 2024-08-17
Download link:
https://figshare.com/articles/The_agnostosDB/12459056/2
Description:
The agnostosDB (dbf02445-20200519) is a comprehensive dataset of microbial gene clusters (GCs) from genomes and metagenomes. It contains 5,287,759 GCs and more than 280M genes drawn from the bacterial and archaeal genomes of the Genome Taxonomy Database (GTDB) and from five large-scale metagenomic projects: 583 marine metagenomes from the Tara Oceans expedition (TARA), the Malaspina expedition, Ocean Sampling Day (OSD), and the Global Ocean Sampling Expedition (GOS), complemented with 1,246 metagenomes from the Human Microbiome Project (HMP) phases I and II. The dataset is described in Vanni et al. 2020.
Additional and more detailed information about the dataset creation and some of its applications can be found at https://dark.metagenomics.eu/. Related to the agnostosDB is the agnostos-wf, a Snakemake workflow hosted in the GitHub repository https://github.com/functional-dark-side/agnostos-wf. The agnostos-wf can be used to search the agnostosDB gene cluster HMM profiles and/or to integrate new sequence data (genes/contigs) into it.
Provider:
figshare
Created:
2020-07-09



