
A comparison of differential DNA methylation analysis methods for continuous outcomes: implications for epigenetic studies

NIAID Data Ecosystem, indexed 2026-05-10
Download link:
https://figshare.com/articles/dataset/A_comparison_of_differential_DNA_methylation_analysis_methods_for_continuous_outcomes_implications_for_epigenetic_studies/31813902
Description:
Univariate methods are widely employed in epigenome-wide association studies to identify CpGs associated with phenotypic traits. However, their performance has not been thoroughly evaluated. We compared three commonly used methods, limma, Spearman’s correlation (SC), and quantile regression (QR), for the analysis of methylation changes with gestational and individual age across multiple cohorts. The comparison was based on the reproducibility, genomic location distribution, and predictive accuracy of the CpGs identified as differentially methylated. Limma identified more consistent gestational age-associated CpGs (n = 1,846) than SC (n = 1,409; p = 3.25e-15) and QR (n = 1,145; p < 2.2e-16). CpGs selected by limma and SC were more clustered within chromosomes than those selected by QR, as determined by nearest neighbor index analysis (p < 0.05). For gestational age prediction using the top 100 features, random forest and elastic net yielded more accurate models (lower RMSE) for limma and SC than for QR (all p < 0.05). With the top 10,000 features, random forest again favored limma and SC over QR, while elastic net performed comparably across methods. Similar results were obtained for the analysis of individual age. These findings indicate that the choice of differential methylation analysis method affects CpG-level reproducibility, within-chromosome co-location, and predictive accuracy. Overall, limma offers a strong balance of reproducibility and predictive value.

DNA methylation is a modification of the DNA that influences and reflects how cells respond to different biological or environmental factors. These modifications usually occur at specific positions in the DNA known as CpG sites. Studies that look at these sites across the entire genome, called epigenome-wide association studies (EWAS), often test one site at a time (known as univariate analysis) to find CpG sites that are linked to various biological phenotypes. However, it is unclear how well different univariate methods perform in these studies. In this work, we compared three common statistical methods (limma, Spearman’s correlation, and quantile regression) using data from several cohorts. We examined how reproducible the results were, how the identified CpG sites were distributed across the chromosomes and key regulatory regions, and how well these CpG sites predict gestational and individual age. Our results suggest that limma detects CpG sites more consistently across studies and leads to higher prediction accuracy than the other methods. CpG sites identified by limma and Spearman’s correlation also tend to lie closer together within chromosomes. These patterns were similar for both gestational and individual age.
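To make the idea of testing one site at a time concrete, the following is a minimal sketch of a univariate scan using Spearman’s correlation, one of the three methods compared. It is not the study's pipeline, which this listing does not describe. The function names (`average_ranks`, `spearman`, `rank_cpgs`), the CpG identifiers, and the toy values are all hypothetical, and the sketch omits p-values, covariate adjustment, and multiple-testing correction.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a constant input counts as no association
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rank_cpgs(beta, outcome):
    """Test each CpG separately and sort by absolute correlation.

    beta: dict mapping a CpG id to its methylation values, one per sample.
    outcome: continuous outcome per sample, in the same sample order.
    """
    rhos = {cpg: spearman(vals, outcome) for cpg, vals in beta.items()}
    return sorted(rhos.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical toy data: five samples, three CpG sites.
outcome = [25.0, 30.0, 34.0, 38.0, 41.0]
beta = {
    "cpg_A": [0.10, 0.20, 0.35, 0.50, 0.90],
    "cpg_B": [0.80, 0.60, 0.70, 0.30, 0.20],
    "cpg_C": [0.40, 0.10, 0.50, 0.20, 0.45],
}

ranked = rank_cpgs(beta, outcome)
for cpg, rho in ranked:
    print(f"{cpg}\t{rho:+.2f}")
```

Selecting the top-ranked sites from such a list is one plausible way to obtain a feature set like the "top 100 features" mentioned in the description, though the exact selection rule used in the study is not given here.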
Created:
2026-03-19