
When old metagenomic data meet newly sequenced genomes, a case study

Source: Figshare. Updated 2018-06-14. Indexed 2026-04-29.
Download link:
https://figshare.com/articles/dataset/When_old_metagenomic_data_meet_newly_sequenced_genomes_a_case_study/6532187
Description:
Dozens of computational methods have been developed to identify the species present in a metagenomic dataset. Many of these methods depend on the set of sequenced microbial species, which is still far from representative. To see how newly sequenced genomes affect analysis results, we re-analyzed a shotgun metagenomic dataset composed of twelve colitis-free and ten colitis-related metagenomic samples. Unexpectedly, we identified at least two new phyla that may relate to colitis development in patients, in addition to the phylum identified previously. Compared with the previously identified phylum that differed between the two types of samples, the differences associated with the two new phyla are statistically more significant. Moreover, the abundance of the two new phyla correlates more strongly with the severity of colitis. Surprisingly, even when repeating the analyses implemented in the previous study, we found that at least one main conclusion of that study is not supported. Our study indicates the importance of re-analyzing existing metagenomic datasets and the necessity of considering multiple updated tools in metagenomic studies. It also sheds light on the limitations of currently popular tools and the importance of inferring the presence of taxa without relying on available sequenced genomes.
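
The description compares phylum abundance between the twelve colitis-free and ten colitis-related samples but does not say which statistical test was used. As an illustration only, the sketch below applies a two-sided permutation test on the difference in group means, a technique chosen here and not confirmed by the source. The function name `permutation_p_value` and every abundance value are hypothetical placeholders, not data from the dataset.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the absolute difference in group means.

    Illustrative sketch only: the source does not state which test the
    authors applied to the phylum abundances.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Reassign samples to the two groups at random, keeping group sizes.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Synthetic relative abundances of one phylum (placeholders, NOT real data).
colitis_free = [0.02, 0.03, 0.01, 0.04, 0.02, 0.03,
                0.02, 0.01, 0.03, 0.02, 0.04, 0.03]   # 12 samples
colitis_related = [0.10, 0.12, 0.08, 0.15, 0.09,
                   0.11, 0.13, 0.10, 0.14, 0.12]      # 10 samples

p_value = permutation_p_value(colitis_free, colitis_related)
print(f"permutation p-value: {p_value:.4f}")
```

Running the same function once per phylum and comparing the resulting p-values is one way to ask whether a newly detected phylum separates the two sample groups more clearly than a previously reported one. With many phyla, a multiple-testing correction would also be needed.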

Created:
2018-06-14