Additional file 24 of Proteobacteria explain significant functional variability in the human gut microbiome
Source: DataCite Commons. Updated: 2024-12-18. Indexed: 2024-07-25.
Download link:
https://springernature.figshare.com/articles/dataset/Additional_file_24_of_Proteobacteria_explain_significant_functional_variability_in_the_human_gut_microbiome/4780264
Description:
Source data for figures.
Figure 1, source data 1: matrix of read counts (after rarefaction) for every gene family in each sample included in the present study.
Figure 1, source data 2: matrix of average family lengths for every gene family in each sample included in the present study.
Figure 1, source data 3: log-RPKG abundances for every gene family mapped in the present study.
Figure 2, source data 1: residual log-RPKG abundances (i.e., after fitting the linear model) for every gene family mapped in the present study.
Figure 3, source data 1: counts of invariable, non-significant, and variable gene families per pathway. “Strong,” “medium,” and “weak” refer to FDR cutoffs of 0.05, 0.10, and 0.25, respectively.
Figure 3, source data 2: counts of invariable, non-significant, and variable gene families for ribosomes in each domain of life.
Figure 4, source data 1: residual log-RPKG scaled by the expected variance under the null model (see the “Methods” section).
Figure 6, source data 1: log10 phylogenetic distribution (PD), log10 residual variance statistics (residvar), significance at 5% FDR (invariable coded as “dn”, variable coded as “up”, non-significant coded as “ns”), presence in at least one bacterial/archaeal genome in KEGG, and annotations for all measured gene families.
Figure 6, source data 2: counts of significant associations of invariable, non-significant, and variable gene families with taxonomic summary statistics.
Figure 7, source data 1: counts of significant associations of invariable, non-significant, and variable gene families with phylum-level abundances.
Figure 8, source data 1: q values for gene families in the overall test.
Figure 8, source data 2: q values for gene families in phylum-specific tests.
Figure 8, source data 3: JSON-formatted lists of significantly (in)variable or non-significant gene families at 5% (“strong”), 10% (“med”), and 25% FDR (“weak”); overall test.
Figure 8, source data 4: JSON-formatted lists of significantly (in)variable or non-significant gene families at 5% (“strong”), 10% (“med”), and 25% FDR (“weak”); phylum-specific tests.
(BZ 51464 kb)
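The description names three FDR tiers (“strong” at 0.05, “med” at 0.10, “weak” at 0.25) without showing how a q value maps onto them. The following is a minimal sketch of one possible mapping, not the authors' code. The function name `fdr_tier`, the use of `<=` rather than `<` at each cutoff, and the choice to return only the strictest tier are all assumptions made for illustration; the record does not state whether the published lists are nested.

```python
from typing import Optional

# Cutoffs as stated in the description: strong 0.05, med 0.10, weak 0.25.
# Ordered from strictest to most lenient.
FDR_TIERS = [("strong", 0.05), ("med", 0.10), ("weak", 0.25)]


def fdr_tier(q_value: float) -> Optional[str]:
    """Return the strictest tier whose cutoff the q value meets, else None.

    Hypothetical helper: the inclusive comparison (<=) is an assumption,
    since the dataset record does not say how boundary values are handled.
    """
    for label, cutoff in FDR_TIERS:
        if q_value <= cutoff:
            return label
    return None  # not significant at any of the three cutoffs
```

Under these assumptions, a gene family with a q value of 0.07 would be labeled “med”, and one with a q value of 0.5 would receive no tier.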
Provider:
Figshare
Created:
2017-03-23



