Ultra-High Dimensional Quantile Regression for Longitudinal Data: An Application to Blood Pressure Analysis
Source: DataCite Commons | Updated: 2025-04-01 | Indexed: 2024-07-29
Download link:
https://tandf.figshare.com/articles/dataset/Ultra-high_Dimensional_Quantile_Regression_for_Longitudinal_Data_an_Application_to_Blood_Pressure_Analysis/21235588/3
Resource description:
Despite major advances in research and treatment, identifying important genotype risk factors for high blood pressure remains challenging. Traditional genome-wide association studies (GWAS) focus on one single nucleotide polymorphism (SNP) at a time. We aim to select among over half a million SNPs along with time-varying phenotype variables via simultaneous modeling and variable selection, focusing on the most dangerous blood pressure levels at high quantiles. Taking advantage of rich data from a large-scale public health study, we develop and apply a novel quantile penalized generalized estimating equations (GEE) approach, incorporating several key aspects including ultra-high dimensional genetic SNPs, the longitudinal nature of blood pressure measurements, time-varying covariates, and conditional high quantiles of blood pressure. Importantly, we identify interesting new SNPs for high blood pressure. Besides, we find blood pressure levels are likely heterogeneous, where the important risk factors identified differ among quantiles. This comprehensive picture of conditional quantiles of blood pressure can allow more insights and targeted treatments. We provide an efficient computational algorithm and prove consistency, asymptotic normality, and the oracle property for the quantile penalized GEE estimators with ultra-high dimensional predictors. Moreover, we establish model-selection consistency for high-dimensional BIC. Simulation studies show the promise of the proposed approach. Supplementary materials for this article are available online.
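The abstract's focus on conditional high quantiles rests on the quantile check loss, which can be illustrated with a minimal sketch. This is not the authors' quantile penalized GEE estimator: it omits the penalty, the covariates, and the within-subject correlation, and only shows that minimizing the check loss over a constant recovers a sample quantile. The function names and the blood pressure readings are hypothetical, made up for illustration.

```python
# Illustrative sketch only: the plain quantile check loss, without the
# penalty or GEE working-correlation structure described in the abstract.

def check_loss(residual, tau):
    """Check loss rho_tau(u) = u * (tau - 1[u < 0])."""
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def total_check_loss(values, candidate, tau):
    """Sum of check losses when every value is predicted by `candidate`."""
    return sum(check_loss(v - candidate, tau) for v in values)


def best_constant(values, tau):
    """Observed value that minimizes the total check loss at level tau."""
    return min(sorted(values), key=lambda c: total_check_loss(values, c, tau))


# Hypothetical systolic readings in mmHg (made-up numbers, not study data).
readings = [112, 118, 121, 125, 130, 134, 139, 147, 158]

median_fit = best_constant(readings, 0.5)   # central blood pressure level
upper_fit = best_constant(readings, 0.75)   # a higher, riskier level

print(median_fit, upper_fit)
```

Raising tau weights under-prediction more heavily than over-prediction, which pulls the fitted value toward the upper tail. That is why a model fitted at a high quantile can pick out different risk factors than one fitted at the median.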
Provider:
Taylor & Francis
Created:
2022-11-11
Collected summary
Dataset introduction

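Background and challenges
Background overview
This dataset accompanies work that applies ultra-high dimensional quantile regression to longitudinal blood pressure data, with the aim of identifying genotype factors associated with high-risk blood pressure levels. By combining ultra-high dimensional genetic SNPs with the longitudinal structure of the data, the study gives a comprehensive analysis of the conditional quantiles of blood pressure, which helps uncover new SNPs and risk factors that differ across quantiles.
The above content was collected and summarized by 遇见数据集.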



