When old metagenomic data meet newly sequenced genomes, a case study
Source: NIAID Data Ecosystem (indexed 2026-03-10)
Download link:
https://figshare.com/articles/dataset/When_old_metagenomic_data_meet_newly_sequenced_genomes_a_case_study/6532187
Description:
Dozens of computational methods have been developed to identify the species present in a metagenomic dataset. Many of these methods depend on the set of sequenced microbial species available, which is still far from representative. To see how newly sequenced genomes affect analysis results, we re-analyzed a shotgun metagenomic dataset composed of twelve colitis-free metagenomic samples and ten colitis-related metagenomic samples. Unexpectedly, we identified at least two new phyla that may relate to colitis development in patients, in addition to the phylum identified previously. The differences associated with the two new phyla are statistically more significant than those of the previously identified phylum that differed between the two types of samples. Moreover, the abundance of the two new phyla correlates more strongly with the severity of colitis. Surprisingly, even when repeating the analyses implemented in the previous study, we found that at least one of its main conclusions is not supported. Our study indicates the importance of re-analyzing existing metagenomic datasets and the necessity of considering multiple updated tools in metagenomic studies. It also sheds light on the limitations of currently popular tools and on the importance of inferring the presence of taxa without relying on available sequenced genomes.
Created:
2018-06-14



