
MetaGenomic Species (MGS:236) from Distal Human Gut Microbiota (MetaHit), Sample O2.UC37-2. Streptococcus thermophilus CAG:236

Source: NIAID Data Ecosystem (indexed 2026-03-07)
Download link:
https://www.ncbi.nlm.nih.gov/bioproject/PRJEB775
Description:
Metagenomic data acquired by deep sequencing is immensely complex, lacks apparent structure and is typically dominated by unknown species. Using an abundance co-variance strategy, we group highly co-varying genes into MetaGenomic Species, which represent a wide range of biological entities: bacterial genomes, plasmids, genomic islands, clonal variation and bacteriophages. Applying this concept to a new 3.9 million microbial gene catalogue derived from 396 human stool samples, we identified 7,381 such MetaGenomic Species. They range in size from 3 to 6,319 genes, with 741 MetaGenomic Species resembling bacterial genomes in the number of genes they contain. The MetaGenomic Species display remarkable consistency in taxonomy and GC content, and 247 of the MetaGenomic Species assemblies even pass the HMP high-quality draft genome criteria. A large proportion (73%) of the MetaGenomic Species display no sequence similarity to any previously sequenced organism. Smaller MetaGenomic Species are enriched for genes characteristic of bacteriophages and for functions important for biotic interactions, and they show strong dependencies on gene-rich MetaGenomic Species. We present the first unsupervised structuring of a highly complex series of metagenomic samples into biological entities, including a global analysis of the genetic interdependencies between bacteria, plasmids, phages and genomic islands in the human distal gut.
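
The idea of grouping genes by abundance co-variance can be illustrated with a minimal sketch. The description above does not specify the clustering algorithm, so the greedy seed-based grouping, the Pearson correlation measure, the 0.9 threshold, and all names below (`pearson`, `group_covarying_genes`, `toy_profiles`) are assumptions chosen for illustration, not the study's actual method or data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0  # a constant profile is treated as uncorrelated
    return sum(a * b for a, b in zip(dx, dy)) / denom


def group_covarying_genes(profiles, min_corr=0.9):
    """Greedily group genes whose profiles correlate with a seed gene.

    profiles maps a gene id to its abundances, one value per sample.
    The threshold min_corr is an illustrative assumption.
    """
    unassigned = list(profiles)
    groups = []
    while unassigned:
        seed = unassigned.pop(0)
        members = [seed]
        remaining = []
        for gene in unassigned:
            if pearson(profiles[seed], profiles[gene]) >= min_corr:
                members.append(gene)
            else:
                remaining.append(gene)
        unassigned = remaining
        groups.append(members)
    return groups


# Hypothetical toy abundances across four samples (not real data).
toy_profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.1, 4.0, 6.2, 8.0],
    "gene_c": [4.0, 3.0, 2.0, 1.0],
    "gene_d": [8.0, 6.1, 4.0, 2.1],
}
groups = group_covarying_genes(toy_profiles)
print(groups)
```

In this toy input, gene_a and gene_b rise together across samples while gene_c and gene_d fall together, so the sketch separates them into two groups. A greedy pass like this depends on the order of the seed genes; it is meant only to show how shared abundance patterns can define a group.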

Created:
2013-07-16