Table2_Decoding Genetic Markers of Multiple Phenotypic Layers Through Biologically Constrained Genome-To-Phenome Bayesian Sparse Regression.csv
Source: NIAID Data Ecosystem; indexed 2026-03-13
Download link:
https://figshare.com/articles/dataset/Table2_Decoding_Genetic_Markers_of_Multiple_Phenotypic_Layers_Through_Biologically_Constrained_Genome-To-Phenome_Bayesian_Sparse_Regression_csv/19452548
Description:
The applicability of multivariate approaches for the joint analysis of genomics and phenomics information is currently limited by a lack of scalability and by the difficulty of interpreting the resulting findings from a biological perspective. To tackle these limitations, we present Bayesian Genome-to-Phenome Sparse Regression (G2PSR), a novel multivariate regression method based on sparse SNP-gene constraints. The statistical framework of G2PSR is based on a Bayesian neural network, where constraints on SNP-gene associations are integrated by incorporating a priori knowledge linking variants to their respective genes; the phenotypic data are then reconstructed in the output layer. Interpretability is promoted by inducing sparsity on the genes through variational dropout, which allows estimating the uncertainty associated with each gene, and its related SNPs, in the reconstruction task. Ultimately, G2PSR is conceived to avoid the need for multiple testing correction and to assess the combined effect of SNPs, thus increasing the statistical power for detecting genome-to-phenome associations. The effectiveness of G2PSR was demonstrated on synthetic and real data, in comparison with state-of-the-art methods based on group-wise sparsity constraints. The application to real data consisted of an imaging-genetics analysis of the Alzheimer's Disease Neuroimaging Initiative data, relating SNPs from more than 3,500 genes to clinical and multivariate brain volumetric information. The experimental results show that our method can provide accurate selection of relevant genes in datasets with a large SNP-to-sample ratio, thus overcoming the main limitations of current genome-to-phenome association methods.
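To make the architecture in the description more concrete, the following is a minimal NumPy sketch of a SNP-to-gene constrained forward pass. It is an illustration under stated assumptions, not the authors' implementation: the function names (`build_mask`, `forward`, `dropout_rate`), the toy dimensions, the hard 0/1 gene gate, and the random weights are all hypothetical. The gate stands in for the gene-level sparsity that the description attributes to variational dropout, and `dropout_rate` uses the standard variational-dropout relation p = alpha / (1 + alpha), which may differ from the exact parameterization used in G2PSR.

```python
import numpy as np

def build_mask(snp_to_gene, n_genes):
    """Binary (n_snps, n_genes) matrix: entry [s, g] is 1 if SNP s maps to gene g."""
    mask = np.zeros((len(snp_to_gene), n_genes))
    mask[np.arange(len(snp_to_gene)), snp_to_gene] = 1.0
    return mask

def forward(X, W_enc, mask, gene_keep, W_dec):
    """Deterministic forward pass: SNPs -> genes (masked) -> phenotypes."""
    genes = X @ (W_enc * mask)   # each gene only receives input from its own SNPs
    genes = genes * gene_keep    # gene-level gate (assumed stand-in for learned sparsity)
    return genes @ W_dec         # reconstruct phenotypic features in the output layer

def dropout_rate(log_alpha):
    """Dropout probability implied by a variational-dropout parameter alpha."""
    alpha = np.exp(log_alpha)
    return alpha / (1.0 + alpha)

# Toy example (all values hypothetical): 6 SNPs, 3 genes, 2 phenotypic features.
rng = np.random.default_rng(0)
snp_to_gene = np.array([0, 0, 1, 2, 2, 2])         # a priori SNP-to-gene annotation
mask = build_mask(snp_to_gene, n_genes=3)
X = rng.integers(0, 3, size=(4, 6)).astype(float)  # genotypes coded 0/1/2 for 4 samples
W_enc = rng.normal(size=(6, 3))
W_dec = rng.normal(size=(3, 2))
gene_keep = np.array([1.0, 0.0, 1.0])              # gene 1 treated as pruned
Y_hat = forward(X, W_enc, mask, gene_keep, W_dec)
```

In this sketch, changing the genotype of a SNP that belongs to a pruned gene leaves the reconstruction unchanged, which is the sense in which gene-level sparsity yields a selection of relevant genes together with their SNPs.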
Created:
2022-03-30



