
Absence of enterotypes in the human gut microbiomes reanalyzed with non-linear dimensionality reduction methods

Source: DataCite Commons. Updated 2023-05-16. Indexed 2024-07-29.
Download link:
https://figshare.com/articles/dataset/Absence_of_enterotypes_in_the_human_gut_microbiomes_reanalyzed_with_non-linear_dimensionality_reduction_methods/19091423
Resource description:
Here we provide data and software for the work "Absence of enterotypes in the human gut microbiomes reanalyzed with non-linear dimensionality reduction methods".

Enterotypes of the human gut microbiome have been proposed as a powerful prognostic tool for evaluating the correlation between lifestyle, nutrition, and disease. However, the number of enterotypes suggested in the literature ranges from two to four. The growth of available metagenome data and the use of exact, non-linear methods of data analysis challenge the very concept of clusters in the multidimensional space of bacterial microbiomes.

We demonstrate the presence of a lower-dimensional structure in the microbiome space, with high-dimensional data concentrated near a low-dimensional non-linear submanifold, but the absence of distinct and stable clusters that could represent enterotypes. This observation is robust across diverse combinations of dimensionality reduction techniques and clustering algorithms.

We used 16S rRNA genotype data from the National Institutes of Health Human Microbiome Project (HMP) and the American Gut Project (AGP), presented at the Order, Family, and Genus taxonomic levels (O, F, and G, respectively). These are the largest openly accessible datasets available, and they provide enough data points to estimate the clustering partition correctly and to construct a manifold. We used 4587 HMP samples from stool and rectum body sites, downloaded from https://portal.hmpdacc.org/, and 9511 AGP samples, downloaded from https://figshare.com/, as abundance matrices. For comparison with the original research, we also analyzed the Sanger, Illumina, and Pyroseq datasets from http://www.bork.embl.de/Docu/Arumugam_et_al_2011/.

All datasets are provided in `data.zip` and are normalized by dividing the Operational Taxonomic Unit (OTU) values by the total sum of abundances for the given sample.
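The normalization described above (dividing each OTU value by the sample's total abundance) can be illustrated with a minimal sketch. It is not the authors' code: the function name `normalize_sample`, the dictionary layout, and the toy counts are assumptions made for illustration only, and the zero-total handling is a choice the description does not specify.

```python
def normalize_sample(counts):
    """Convert raw OTU counts for one sample into relative abundances.

    `counts` maps a taxon name to its raw abundance (hypothetical layout).
    """
    total = sum(counts.values())
    if total == 0:
        # Assumption: an empty sample is returned as all zeros
        # instead of raising a division-by-zero error.
        return {taxon: 0.0 for taxon in counts}
    return {taxon: value / total for taxon, value in counts.items()}


# Toy genus-level counts, invented for illustration (not from data.zip).
sample = {"Bacteroides": 60, "Prevotella": 30, "Ruminococcus": 10}
relative = normalize_sample(sample)
print(relative)
```

After this step each sample sums to 1, so samples with different sequencing depths become comparable as compositions.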
Provider:
figshare
Created:
2022-01-29