
Large-Scale Hypothesis Testing for Causal Mediation Effects with Applications in Genome-wide Epigenetic Studies

Source: DataCite Commons | Updated: 2021-09-29 | Indexed: 2024-07-28
Download link:
https://tandf.figshare.com/articles/dataset/Large-Scale_Hypothesis_Testing_for_Causal_Mediation_Effects_with_Applications_in_Genome-wide_Epigenetic_Studies/14396239/1
Description:
In genome-wide epigenetic studies, it is of great scientific interest to assess whether the effect of an exposure on a clinical outcome is mediated through DNA methylation. However, statistical inference for causal mediation effects is challenging because one must test a large number of composite null hypotheses across the whole epigenome. Two popular tests, the Wald-type Sobel’s test and the joint significance test using the traditional null distribution, are underpowered and thus can miss important scientific discoveries. In this paper, we show that under the composite null of no mediation effect, the null distribution of Sobel’s test is not the standard normal distribution and the null distribution of the joint significance test is not uniform. The discrepancy is most pronounced in finite samples and in the singular point null case, in which the exposure has no effect on the mediator and the mediator has no effect on the outcome. Our results explain why these two tests are underpowered and, more importantly, motivate us to develop a more powerful Divide-Aggregate Composite-null Test (DACT) for the composite null hypothesis of no mediation effect by leveraging epigenome-wide data. We adopt Efron’s empirical null framework for assessing the statistical significance of the DACT. We show analytically that the proposed DACT method has improved power and controls the type I error rate well. Our extensive simulation studies show that, in finite samples, the DACT method properly controls the type I error rate and outperforms Sobel’s test and the joint significance test for detecting mediation effects. We applied the DACT method to the US Department of Veterans Affairs Normative Aging Study, an ongoing prospective cohort study that included men who were aged 21 to 80 years at entry.
We identified multiple DNA methylation CpG sites that might mediate the effect of smoking on lung function, with effect sizes ranging from –0.18 to –0.79 and the false discovery rate controlled at level 0.05, including CpG sites in the genes AHRR and F2RL3. Our sensitivity analysis found small residual correlations (less than 0.01) between the error terms of the outcome and mediator regressions, suggesting that our results are robust to unmeasured confounding factors.
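
The conservativeness of the two classical tests under the singular point null can be illustrated with a small simulation. The sketch below is not the authors' code and does not implement the DACT. It assumes that the z-statistics for the exposure-to-mediator path (`z_a`) and the mediator-to-outcome path (`z_b`) are independent standard normal draws, which is an idealized large-sample version of the case in which both effects are zero. All function names are hypothetical and chosen for illustration only.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value of z against the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def sobel_p(z_a, z_b):
    """Sobel p-value computed from the z-statistics of the two paths.

    With z_a = a / se_a and z_b = b / se_b, the Sobel statistic
    a*b / sqrt(a^2 * se_b^2 + b^2 * se_a^2) equals
    z_a * z_b / sqrt(z_a^2 + z_b^2).
    """
    denom = math.hypot(z_a, z_b)
    if denom == 0.0:
        return 1.0  # both estimates are exactly zero: no evidence at all
    return two_sided_p(z_a * z_b / denom)


def joint_significance_p(z_a, z_b):
    """Joint significance p-value: the larger of the two path p-values."""
    return max(two_sided_p(z_a), two_sided_p(z_b))


def null_rejection_rates(n_tests=200_000, alpha=0.05, seed=0):
    """Empirical rejection rates when both path effects are zero.

    Assumption (illustrative): z_a and z_b are independent N(0, 1).
    Returns (sobel_rate, joint_rate).
    """
    rng = random.Random(seed)
    sobel_hits = 0
    joint_hits = 0
    for _ in range(n_tests):
        z_a = rng.gauss(0.0, 1.0)
        z_b = rng.gauss(0.0, 1.0)
        sobel_hits += sobel_p(z_a, z_b) < alpha
        joint_hits += joint_significance_p(z_a, z_b) < alpha
    return sobel_hits / n_tests, joint_hits / n_tests


if __name__ == "__main__":
    sobel_rate, joint_rate = null_rejection_rates()
    print(f"Sobel rejection rate at 0.05: {sobel_rate:.5f}")
    print(f"Joint significance rejection rate at 0.05: {joint_rate:.5f}")
```

Under these assumptions, both rejection rates come out far below the nominal 0.05. For the joint significance test the theoretical rate is 0.05 squared, or 0.0025, because both independent p-values must fall below 0.05. This is consistent with the abstract's claim that the traditional null distributions are miscalibrated in this case.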

Provider:
Taylor & Francis
Created:
2021-04-09