
Fast Signal Region Detection With Application to Whole Genome Association Studies

DataCite Commons | Updated 2025-06-01 | Indexed 2025-05-07
Download link:
https://tandf.figshare.com/articles/dataset/Fast_Signal_Region_Detection_with_Application_to_Whole_Genome_Association_Studies/28395114/2
Resource description:
Research on localizing the genetic basis of diseases and traits has been conducted widely over the last few decades. Scan methods have been developed for region-based analysis in whole-genome association studies, helping us better understand how genetics influences human diseases and traits, especially when multiple causal variants have aggregated effects. In this paper, we propose a fast and effective algorithm, coupled with a high-dimensional test, for simultaneously detecting multiple signal regions; it is distinct from existing methods that use scan or knockoff statistics. The idea is to conduct binary splitting with re-search and arrangement based on a sequence of dynamic critical values, which increases detection accuracy and reduces computation. Theoretical and empirical studies demonstrate that our approach enjoys favorable theoretical guarantees under fewer restrictions and exhibits superior numerical performance with faster computation. Using UK Biobank data to identify genetic regions related to breast cancer, we confirm previous findings and also identify a number of new regions that suggest strong associations with breast cancer risk and deserve further investigation. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
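
The binary-splitting idea in the description can be illustrated with a rough sketch. This is not the authors' algorithm: the standardized sum-of-squares statistic, the depth-dependent threshold, and the names `segment_stat` and `detect_regions` are all assumptions made for illustration, and the paper's re-search and arrangement steps are not reproduced here.

```python
from math import sqrt
from statistics import NormalDist


def segment_stat(z, lo, hi):
    """Standardized sum of squares of z[lo:hi] (assumed test statistic).

    Roughly N(0, 1) for long segments of independent N(0, 1) scores.
    """
    m = hi - lo
    return (sum(v * v for v in z[lo:hi]) - m) / sqrt(2 * m)


def detect_regions(z, alpha=0.05, min_len=1):
    """Return merged half-open (start, end) intervals flagged as signal."""
    if not z:
        return []
    hits = []

    def split(lo, hi, depth):
        # Hypothetical "dynamic" critical value: stricter at deeper levels.
        crit = NormalDist().inv_cdf(1 - alpha / 2 ** (depth + 1))
        if segment_stat(z, lo, hi) <= crit:
            return  # no evidence of signal in this segment, stop splitting
        if hi - lo <= min_len:
            hits.append((lo, hi))  # smallest segment that still rejects
            return
        mid = (lo + hi) // 2
        split(lo, mid, depth + 1)
        split(mid, hi, depth + 1)

    split(0, len(z), 0)

    # Merge adjacent flagged segments into contiguous regions.
    regions = []
    for lo, hi in hits:
        if regions and regions[-1][1] == lo:
            regions[-1] = (regions[-1][0], hi)
        else:
            regions.append((lo, hi))
    return regions
```

On a toy sequence of 16 scores with a block of 4.0 values at positions 8 to 11 and zeros elsewhere, `detect_regions` returns `[(8, 12)]`. Segments that fail to reject are never split further, which is where the computational saving over an exhaustive scan comes from in this sketch.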

Provider:
Taylor & Francis
Created:
2025-04-07