
A comparison of differential DNA methylation analysis methods for continuous outcomes: implications for epigenetic studies

Source: Figshare. Updated 2026-03-19; indexed 2026-04-28.
Download link:
https://figshare.com/articles/dataset/A_comparison_of_differential_DNA_methylation_analysis_methods_for_continuous_outcomes_implications_for_epigenetic_studies/31813902
Resource description:
Univariate methods are widely employed in epigenome-wide association studies to identify CpGs associated with phenotypic traits. However, their performance has not been thoroughly evaluated. We compared three commonly used methods, limma, Spearman’s correlation (SC), and quantile regression (QR), for the analysis of methylation changes with gestational and individual age across multiple cohorts. The comparison was based on the reproducibility, genomic location distribution, and predictive accuracy of the CpGs identified as differentially methylated. Limma identified more consistent gestational age-associated CpGs (n = 1,846) than SC (n = 1,409; p = 3.25e-15) and QR (n = 1,145; p [...]). [...] The findings of this study indicate that the choice of differential methylation analysis method affects CpG-level reproducibility, within-chromosome co-location, and predictive accuracy. Overall, limma offers a strong balance of reproducibility and predictive value.

DNA methylation is a modification of DNA that influences and reflects how cells respond to different biological or environmental factors. These modifications usually occur at specific positions in the DNA known as CpG sites. Studies that examine these sites across the entire genome, called epigenome-wide association studies (EWAS), often test one site at a time (known as univariate analysis) to find CpG sites that are linked to various biological phenotypes. However, it is unclear how well different univariate methods perform in these studies. In this work, we compared three common statistical methods (limma, Spearman’s correlation, and quantile regression) using data from several cohorts. We examined how reproducible the results were, how the identified CpG sites were distributed across the chromosomes and key regulatory regions, and how well these CpG sites can predict gestational and individual age.
Our results suggest that limma analysis detects CpG sites more consistently across studies and leads to higher prediction accuracy than the other methods. CpG sites identified by limma and Spearman’s correlation also tend to be co-located closer together within chromosomes. These patterns were similar when analyzing both gestational and individual age.
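
As an illustration of what "testing one site at a time" and "reproducibility across cohorts" mean in practice, the following is a minimal Python sketch. It is not the study's pipeline: the cohorts are simulated, the effect size and noise level are arbitrary assumptions, all function names (`ranks`, `spearman`, `simulate_cohort`, `top_cpgs`) are hypothetical, and only Spearman's correlation is shown (limma and quantile regression are not reimplemented here). CpGs are ranked by absolute correlation rather than by p-value, which is a simplification.

```python
import random


def ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def simulate_cohort(n_samples, n_cpgs, associated, rng):
    """Toy cohort (assumption): beta-like methylation values and an age outcome."""
    age = [rng.uniform(20, 80) for _ in range(n_samples)]
    cpgs = []
    for c in range(n_cpgs):
        slope = 0.005 if c in associated else 0.0  # arbitrary effect size
        cpgs.append([
            min(1.0, max(0.0, 0.3 + slope * (a - 50) + rng.gauss(0, 0.05)))
            for a in age
        ])
    return age, cpgs


def top_cpgs(age, cpgs, k):
    """Univariate scan: score each CpG separately, keep the k strongest."""
    scores = [abs(spearman(m, age)) for m in cpgs]
    best = sorted(range(len(cpgs)), key=lambda c: scores[c], reverse=True)
    return set(best[:k])


rng = random.Random(0)
associated = set(range(10))  # CpGs 0-9 are truly age-associated in this toy setup
age_a, cpgs_a = simulate_cohort(100, 200, associated, rng)
age_b, cpgs_b = simulate_cohort(100, 200, associated, rng)

top_a = top_cpgs(age_a, cpgs_a, 10)
top_b = top_cpgs(age_b, cpgs_b, 10)
overlap = top_a & top_b  # CpGs selected in both cohorts
print(f"CpGs selected in both cohorts: {len(overlap)} of {len(top_a)}")
```

The size of `overlap` relative to the number of selected CpGs is one simple way to express cross-cohort consistency; the study's own reproducibility measure may be defined differently.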

Created: 2026-03-19