FastGGM: An Efficient Algorithm for the Inference of Gaussian Graphical Model in Biological Networks
Source: NIAID Data Ecosystem. Indexed: 2026-03-09
Download link:
https://figshare.com/articles/dataset/FastGGM_An_Efficient_Algorithm_for_the_Inference_of_Gaussian_Graphical_Model_in_Biological_Networks/2294077
Description:
Biological networks provide additional information for the analysis of human diseases, beyond traditional analyses that focus on single variables. The Gaussian graphical model (GGM), a probability model that characterizes the conditional dependence structure of a set of random variables by a graph, has wide applications in the analysis of biological networks, such as inferring interactions or comparing differential networks. However, existing approaches are either not statistically rigorous or are inefficient for inference on high-dimensional data that include tens of thousands of variables. In this study, we propose an efficient algorithm to estimate a GGM and obtain a p-value and confidence interval for each edge in the graph, based on a recent proposal by Ren et al. (2015). Through simulation studies, we demonstrate that the algorithm is faster by several orders of magnitude than the existing implementation of the Ren et al. method, without losing any accuracy. We then apply our algorithm to two real data sets: transcriptomic data from a study of childhood asthma and proteomic data from a study of Alzheimer’s disease. We estimate the global gene or protein interaction networks for the disease and healthy samples. The resulting networks reveal interesting interactions, and the differential networks between cases and controls show functional relevance to the diseases. In conclusion, we provide a computationally fast algorithm that implements a statistically sound procedure for constructing a Gaussian graphical model and making inference with high-dimensional biological data. The algorithm has been implemented in an R package named “FastGGM”.
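To make the idea of a conditional dependence graph concrete, the following is a minimal sketch in plain Python. It is not the FastGGM algorithm or the Ren et al. procedure, and it does not use the FastGGM R package; it only illustrates the general GGM fact that an edge between two variables corresponds to a nonzero entry in the inverse covariance (precision) matrix. The covariance matrix, the function names, and the classical Fisher z-test used for the p-value are all assumptions chosen for illustration, and that test is only valid in the low-dimensional setting where the sample size is well above the number of variables.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in aug]

def partial_correlations(precision):
    """Partial correlation of i and j given all other variables:
    -omega_ij / sqrt(omega_ii * omega_jj)."""
    n = len(precision)
    return [[1.0 if i == j else
             -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
             for j in range(n)] for i in range(n)]

def fisher_z_pvalue(r, n_samples, n_vars):
    """Two-sided p-value for a partial correlation r using the classical
    Fisher z-transform (illustrative; requires n_samples > n_vars + 1)."""
    dof = n_samples - (n_vars - 2) - 3
    z = math.atanh(r) * math.sqrt(dof)
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical covariance of a chain X -> Y -> Z with unit-variance noise:
# X and Z are marginally correlated but independent given Y.
sigma = [[1.0, 1.0, 1.0],
         [1.0, 2.0, 2.0],
         [1.0, 2.0, 3.0]]

omega = invert(sigma)
pcor = partial_correlations(omega)

# An assumed sample size, used only to illustrate the edge test.
p_xy = fisher_z_pvalue(pcor[0][1], n_samples=100, n_vars=3)
p_xz = fisher_z_pvalue(pcor[0][2], n_samples=100, n_vars=3)
print("partial correlation X-Y:", round(pcor[0][1], 4), "p-value:", p_xy)
print("partial correlation X-Z:", round(pcor[0][2], 4), "p-value:", p_xz)
```

In this toy chain, the X-Z entry of the precision matrix is zero, so the graph has no X-Z edge even though X and Z are marginally correlated. In the high-dimensional setting the abstract targets, the sample covariance cannot be inverted directly, which is why regression-based procedures such as the one the abstract builds on are needed.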
Created:
2016-02-17



