An assessment of compositional methods for the analysis of DNA methylation-based deconvolution estimates

Taylor & Francis Group. Updated 2024-09-20; indexed 2026-04-16.
Download link:
https://tandf.figshare.com/articles/dataset/An_assessment_of_compositional_methods_for_the_analysis_of_DNA_methylation-based_deconvolution_estimates/27074183/1
Resource description:
DNA methylation (DNAm)-based deconvolution estimates contain relative data, forming a composition, that standard methods (testing directly on cell proportions) are ill-suited to handle. In this study we examined the performance of an alternative method, analysis of compositions of microbiomes (ANCOM), for the analysis of DNAm-based deconvolution estimates. We performed two simulation studies comparing ANCOM to a standard approach (a two-sample t-test performed directly on cell proportions) and analyzed a real-world dataset from the Women’s Health Initiative to evaluate the applicability of ANCOM to DNAm-based deconvolution estimates. Our findings indicate that ANCOM can effectively account for the compositional nature of DNAm-based deconvolution estimates. ANCOM adequately controls the false discovery rate while maintaining statistical power comparable to that of standard methods.

DNA methylation (DNAm)-based deconvolution provides highly accurate estimates of the proportion of each cell type in a mixed-cell-type biological sample (e.g., whole blood). These estimates can be used to examine the association between cell type proportions and biological or clinical end points; for example, comparing the estimated neutrophil proportion in whole blood between smokers and non-smokers. Cell proportion data have unique features that present challenges for traditional and widely used statistical methods. In response to this issue, our work presents two simulation studies and a real-world analysis that benchmark the performance of current standard statistical methods against an alternative method called analysis of compositions of microbiomes (ANCOM), which was originally developed for the analysis of microbiome data. In our real-world analysis we used DNAm data collected from the Women’s Health Initiative Long Life Study I and compared the results of each method against a gold standard that is typically not available for these analyses. In each of our simulation studies, ANCOM was able to detect true differences in cell proportions between the groups being compared but had a much lower rate of false discovery than the standard statistical methods. Our real-world analysis demonstrated similar findings. Overall, our study highlights the potential of ANCOM as a powerful and robust method for analyzing DNAm-derived deconvolution estimates when the goal is to relate cell type proportions to biological or clinical end points. ANCOM’s ability to minimize false discovery while maintaining robust statistical power positions it as a valuable addition to the epigenomic analysis toolkit.

DNA methylation (DNAm)-based deconvolution estimates contain inherently relative data, forming a composition. Typical analysis methods (e.g., t-tests, linear regression models, etc.) implemented on the cell proportion estimates themselves are ill-suited to handle the nuances and challenges that compositional data present. The purpose of this study was to examine the applicability of an alternative method that was initially developed for microbiome data, ANCOM (Analysis of Compositions of Microbiomes), for the analysis of DNAm-based deconvolution estimates.

Simulation 1 compared each method’s performance as a function of varying cell counts, effect sizes and study sample sizes. Simulation 2 evaluated each method’s performance where cell counts were simulated based on reference ranges from a control population in Kenya. The specific context for this simulation was tests of differential cell type abundance in neutropenic vs. non-neutropenic control patients. Our real-world data analysis presents a comparison of different methods for cell type differential abundance analyses using a dataset from the Women’s Health Initiative that included both DNAm array data in whole blood and complete blood cell counts.

In simulation study 1, we observed that ANCOM was well equipped to handle correlation between cell type deconvolution estimates across several levels of baseline cell abundance and effect sizes, while maintaining strong statistical power to detect differential abundance in cell types that were generated to be different between groups. The results of simulation study 2 were consistent with those of the first and suggest that standard methods are poorly equipped to handle correlation between cell types in differential abundance analyses. Our analysis of real-world data from the Women’s Health Initiative study affirmed the validity of our simulation results. Applying ANCOM to DNAm-derived deconvolution proportion estimates yielded results consistent with those generated from the analysis of cell counts. However, the standard approach falsely detected differential abundance in eosinophils when using proportion estimates. In totality, ANCOM performed exceptionally well in controlling false discoveries while maintaining strong statistical power.
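The compositional problem described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation design and not the ANCOM implementation: the mean counts, the noise level, the sample size and the cutoff `T_CUTOFF` are all hypothetical values chosen for illustration, and the functions `simulate_group`, `close`, `welch_t` and `run_demo` are names introduced here. It generates absolute counts in which only neutrophils differ between groups, converts them to proportions, and then contrasts a per-cell-type t statistic on proportions with a pairwise log-ratio tally in the spirit of ANCOM's W statistic (the number of pairwise log-ratios involving a cell type that differ between groups).

```python
import math
import random
import statistics

CELL_TYPES = ["neutrophil", "lymphocyte", "monocyte", "eosinophil"]
# Hypothetical mean absolute counts (cells per microliter); not taken from the study.
CONTROL_MEANS = [4000.0, 2000.0, 500.0, 200.0]
CASE_MEANS = [1500.0, 2000.0, 500.0, 200.0]  # only neutrophils truly differ
T_CUTOFF = 4.0  # hypothetical, deliberately strict cutoff on |t|


def simulate_group(n, means, rng, sd_log=0.1):
    """Draw n samples of absolute counts with log-normal noise around each mean."""
    return [[m * math.exp(rng.gauss(0.0, sd_log)) for m in means] for _ in range(n)]


def close(sample):
    """Convert absolute counts to proportions that sum to 1 (the composition)."""
    total = sum(sample)
    return [x / total for x in sample]


def welch_t(a, b):
    """Welch two-sample t statistic for two lists of numbers."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def run_demo(seed=0, n=50):
    rng = random.Random(seed)
    controls = [close(s) for s in simulate_group(n, CONTROL_MEANS, rng)]
    cases = [close(s) for s in simulate_group(n, CASE_MEANS, rng)]
    k = len(CELL_TYPES)

    # Standard approach: one t statistic per cell type, directly on proportions.
    prop_flags = [
        CELL_TYPES[i]
        for i in range(k)
        if abs(welch_t([s[i] for s in controls], [s[i] for s in cases])) > T_CUTOFF
    ]

    # Log-ratio tally: for each cell type, count how many pairwise log-ratios
    # with the other cell types differ between groups.
    w = {}
    for i in range(k):
        w[CELL_TYPES[i]] = 0
        for j in range(k):
            if i == j:
                continue
            a = [math.log(s[i] / s[j]) for s in controls]
            b = [math.log(s[i] / s[j]) for s in cases]
            if abs(welch_t(a, b)) > T_CUTOFF:
                w[CELL_TYPES[i]] += 1
    return prop_flags, w


if __name__ == "__main__":
    flagged, w_counts = run_demo()
    print("Flagged by t statistics on proportions:", flagged)
    print("Pairwise log-ratio tally (W):", w_counts)
```

Because the proportions must sum to one, lowering the neutrophil count raises the proportion of every other cell type even though their absolute counts are unchanged, so the proportion-based statistics tend to flag all four cell types. The log-ratio between two unchanged cell types is unaffected by the shared denominator, so the tally is expected to single out neutrophils. The published ANCOM procedure chooses its cutoff on W from the empirical distribution of W rather than using a fixed threshold as done here.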
Provided by:
Madsen, Tracy E; Liu, Longjian; Salas, Lucas A; Auer, Paul L; Wiencke, John K; Kelsey, Karl T; Koestler, Devin C; Nissen, Emily; Alsup, Alexander; Molinaro, Annette M; Christensen, Brock C; Reiner, Alexander; Liu, Simin
Created:
2024-09-20