
Ultra-high Dimensional Quantile Regression for Longitudinal Data: an Application to Blood Pressure Analysis

Source: DataCite Commons; updated 2024-02-15; indexed 2024-07-29
Download link:
https://tandf.figshare.com/articles/dataset/Ultra-high_Dimensional_Quantile_Regression_for_Longitudinal_Data_an_Application_to_Blood_Pressure_Analysis/21235588/1
Resource description:
Despite major advances in research and treatment, identifying important genotype risk factors for high blood pressure remains challenging. Traditional genome-wide association studies (GWAS) focus on one single nucleotide polymorphism (SNP) at a time. We aim to select among over half a million SNPs along with time-varying phenotype variables via simultaneous modeling and variable selection, focusing on the most dangerous blood pressure levels at high quantiles. Taking advantage of rich data from a large-scale public health study, we develop and apply a novel quantile penalized generalized estimating equations (GEE) approach, incorporating several key aspects including ultra-high dimensional genetic SNPs, the longitudinal nature of blood pressure measurements, time-varying covariates, and conditional high quantiles of blood pressure. Importantly, we identify interesting new SNPs for high blood pressure. Besides, we find blood pressure levels are likely heterogeneous, where the important risk factors identified differ among quantiles. This comprehensive picture of conditional quantiles of blood pressure can allow more insights and targeted treatments. We provide an efficient computational algorithm and prove consistency, asymptotic normality, and the oracle property for the quantile penalized GEE estimators with ultra-high dimensional predictors. Moreover, we establish model-selection consistency for high-dimensional BIC. Simulation studies show the promise of the proposed approach.

Provider:
Taylor & Francis
Created:
2022-09-29