A scalable, accurate, and universal analysis framework to control for sample relatedness in large-scale genome-wide association studies and its application to 79 longitudinal traits in UK Biobank
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/10242061
Description:
Sample relatedness is a major confounder in large-scale GWAS and can result in inflation if not appropriately controlled. Incorporating random effects based on the genetic relationship matrix (GRM) into conventional models is the most widely used strategy. Although effective, this strategy is technically challenging to extend to other complex traits with complicated structures. In this work, we propose a scalable, accurate, and universal analysis framework, SPAGRM, in which sample relatedness is controlled via a precise approximation of the joint distribution of genotypes for related samples in families. SPAGRM can use GRM-free conventional models and is therefore applicable to a wide variety of traits. A hybrid strategy that includes saddlepoint approximation (SPA) greatly increases accuracy when analyzing low-frequency and rare genetic variants, especially when the phenotypic distribution is unbalanced. Extensive simulation studies and real data analyses showed that SPAGRM accurately controls type I error rates and can gain power in longitudinal trait analysis. Building on previous studies, we implemented a refined and meticulous QC pipeline to extract 79 longitudinal traits from UK Biobank primary care data. Applying SPAGRM to these 79 longitudinal traits identified 7,463 genetic loci. For a majority of these traits, this is a pioneering attempt to conduct GWAS with the trait treated as a longitudinal phenotype.
Created:
2023-12-03



