Supplementary Material for: A Non-Parametric Method for Building Predictive Genetic Tests on High-Dimensional Data
Source: NIAID Data Ecosystem (indexed 2026-03-07)
Download link:
https://figshare.com/articles/dataset/Supplementary_Material_for_A_Non-Parametric_Method_for_Building_Predictive_Genetic_Tests_on_High-Dimensional_Data/5121838
Description:
Objective: Predictive tests that capitalize on emerging genetic findings hold great promise for enhanced personalized healthcare. With the emergence of a large amount of data from genome-wide association studies (GWAS), interest has shifted towards high-dimensional risk prediction.
Methods: To form predictive genetic tests on high-dimensional data, we propose a non-parametric method, called the ‘forward ROC method’. The method adopts a computationally efficient algorithm to search for environmental risk factors, genetic predictors across the entire genome, and their possible interactions for an optimal risk prediction model, without relying on prior knowledge of known risk factors. An efficient yet powerful procedure is also incorporated into the method to handle missing data.
Results: Through simulations and real data applications, we found that our proposed method outperformed the existing approaches. We applied the new method to the Wellcome Trust rheumatoid arthritis GWAS dataset with a total of 460,547 markers. The results from the risk prediction analysis suggested important roles of HLA-DRB1 and PTPN22 in predicting rheumatoid arthritis.
Conclusion: We proposed a powerful and robust approach for high-dimensional risk prediction. The new method will facilitate future risk prediction that considers a large number of predictors and their interactions for improved performance.
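To make the idea of a forward, ROC-driven search concrete, the following is a minimal sketch and not the authors' implementation. It assumes binary case/control labels and categorical predictors (for example genotypes coded 0/1/2), scores each sample by the case fraction of its joint predictor cell, and greedily adds the predictor that most increases the in-sample AUC. The function names (`auc`, `cell_scores`, `forward_select`), the stopping threshold `min_gain`, and the toy data are all hypothetical. Missing-data handling and any validation step for choosing model size are omitted, so the AUC reported here is optimistic.

```python
from collections import defaultdict


def auc(scores, labels):
    """Empirical AUC: P(case score > control score), ties counted as 1/2."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def cell_scores(rows, labels, features):
    """Score each sample by the case fraction of its joint predictor cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [n_controls, n_cases]
    keys = [tuple(row[j] for j in features) for row in rows]
    for key, y in zip(keys, labels):
        counts[key][y] += 1
    return [counts[k][1] / (counts[k][0] + counts[k][1]) for k in keys]


def forward_select(rows, labels, max_features=3, min_gain=1e-9):
    """Greedy forward search: add the predictor with the largest AUC gain."""
    selected, best_auc = [], 0.5  # 0.5 is the AUC of a non-informative test
    remaining = set(range(len(rows[0])))
    while remaining and len(selected) < max_features:
        trial = {
            j: auc(cell_scores(rows, labels, selected + [j]), labels)
            for j in sorted(remaining)
        }
        j_best = max(sorted(trial), key=trial.get)
        if trial[j_best] - best_auc < min_gain:
            break  # no candidate improves the AUC enough; stop
        selected.append(j_best)
        best_auc = trial[j_best]
        remaining.remove(j_best)
    return selected, best_auc


# Hypothetical toy data: 6 samples, 3 categorical predictors, 1 = case.
toy_rows = [(0, 0, 1), (1, 0, 0), (0, 1, 1), (1, 1, 0), (0, 0, 0), (1, 1, 1)]
toy_labels = [0, 0, 1, 1, 0, 1]
chosen, fitted_auc = forward_select(toy_rows, toy_labels)
print("selected predictor indices:", chosen, "in-sample AUC:", fitted_auc)
```

Scoring by joint cells lets interactions between selected predictors enter the model without specifying a parametric form, which is the sense in which this sketch is non-parametric. On genome-wide data the number of cells grows quickly with each added predictor, so a practical version would need a held-out or cross-validated AUC to decide when to stop.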
Created:
2017-06-20



