Genetic Variant Set-Based Tests Using the Generalized Berk-Jones Statistic with Application to a Genome-Wide Association Study of Breast Cancer
DataCite Commons | Updated: 2021-09-29 | Indexed: 2024-07-27
Download link:
https://tandf.figshare.com/articles/dataset/Genetic_Variant_Set-Based_Tests_Using_the_Generalized_Berk_Jones_Statistic_With_Application_to_a_Genome-Wide_Association_Study_of_Breast_Cancer/9816446/2
Description:
Studying the effects of groups of single nucleotide polymorphisms (SNPs), as in a gene, genetic pathway, or network, can provide novel insight into complex diseases like breast cancer, uncovering new genetic associations and augmenting the information that can be gleaned from studying SNPs individually. Common challenges in set-based genetic association testing include weak effect sizes, correlation between SNPs in a SNP-set, and scarcity of signals, with individual SNP effects often ranging from extremely sparse to moderately sparse in number. Motivated by these challenges, we propose the Generalized Berk-Jones (GBJ) test for the association between a SNP-set and outcome. The GBJ extends the Berk-Jones statistic by accounting for correlation among SNPs, and it provides advantages over the Generalized Higher Criticism test when signals in a SNP-set are moderately sparse. We also provide an analytic p-value calculation for SNP-sets of any finite size, and we develop an omnibus statistic that is robust to the degree of signal sparsity. An additional advantage of our work is the ability to conduct inference using individual SNP summary statistics from a genome-wide association study (GWAS). We evaluate the finite sample performance of the GBJ through simulation and apply the method to identify breast cancer risk genes in a GWAS conducted by the Cancer Genetic Markers of Susceptibility Consortium. Our results suggest evidence of association between FGFR2 and breast cancer and also identify other potential susceptibility genes, complementing conventional SNP-level analysis.
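For orientation, the classical Berk-Jones statistic that the GBJ builds on can be sketched in a few lines. The sketch below is not the authors' GBJ: it assumes independent SNP-level p-values (so it ignores the correlation that the GBJ is designed to handle), uses the common one-sided form that maximizes n * K(i/n, p_(i)) over the smallest half of the ordered p-values, and replaces the paper's analytic p-value with a simple Monte Carlo approximation. The function names `berk_jones` and `mc_p_value` are hypothetical and chosen only for illustration.

```python
import math
import random


def berk_jones(p_values):
    """One-sided Berk-Jones statistic, assuming independent p-values.

    Computes max over i <= n/2 of n * K(i/n, p_(i)), where
    K(x, y) = x*log(x/y) + (1-x)*log((1-x)/(1-y)) and a term only
    counts when p_(i) < i/n (more small p-values than expected).
    """
    p_sorted = sorted(p_values)
    n = len(p_sorted)
    best = 0.0
    for i in range(1, n // 2 + 1):
        x = i / n
        y = max(p_sorted[i - 1], 1e-300)  # floor to avoid log(0)
        if y < x:
            k = x * math.log(x / y) + (1 - x) * math.log((1 - x) / (1 - y))
            best = max(best, n * k)
    return best


def mc_p_value(p_values, n_sim=2000, seed=0):
    """Monte Carlo p-value under the global null with independence.

    This is an illustrative stand-in, not the analytic calculation
    described in the abstract.
    """
    rng = random.Random(seed)
    observed = berk_jones(p_values)
    n = len(p_values)
    exceed = 0
    for _ in range(n_sim):
        null_stat = berk_jones([rng.random() for _ in range(n)])
        if null_stat >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_sim + 1)
```

Calling `mc_p_value` on the p-values of the SNPs in one gene would give a set-level p-value under these simplifying assumptions. With correlated SNPs, as in real GWAS data, the independence-based null used here is not valid, which is the gap the GBJ is described as addressing.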
Provider:
Taylor & Francis
Created:
2019-10-25



