
Graphical Model Selection for Gaussian Conditional Random Fields in the Presence of Latent Variables

Source: DataCite Commons. Updated 2024-08-07; indexed 2024-08-17.
Download link:
https://tandf.figshare.com/articles/dataset/Graphical_Model_Selection_for_Gaussian_Conditional_Random_Fields_in_the_Presence_of_Latent_Variables/5885179/1
Description:
We consider the problem of learning a conditional Gaussian graphical model in the presence of latent variables. Building on recent advances in this field, we suggest a method that decomposes the parameters of a conditional Markov random field into the sum of a sparse and a low-rank matrix. We derive convergence bounds for this estimator and show that it is well-behaved in the high-dimensional regime as well as “sparsistent” (i.e., capable of recovering the graph structure). We then show how proximal gradient algorithms and semi-definite programming techniques can be employed to fit the model to thousands of variables. Through extensive simulations, we illustrate the conditions required for identifiability and show that there is a wide range of situations in which this model performs significantly better than its counterparts, for example, by accommodating more latent variables. Finally, the suggested method is applied to two datasets comprising individual-level data on genetic variants and metabolite levels. We show that our results replicate better than those of alternative approaches and exhibit enriched biological signal. Supplementary materials for this article are available online.
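
The description mentions a sparse plus low-rank decomposition fitted with proximal gradient algorithms. As a rough illustration only, and not the authors' estimator, the sketch below shows the two proximal operators commonly paired with such a decomposition: entrywise soft-thresholding (for an l1 penalty on the sparse part) and singular value thresholding (for a nuclear-norm penalty on the low-rank part). The choice of these two penalties, the function names, and the threshold values are all assumptions made for this example; the description does not state them.

```python
import numpy as np


def soft_threshold(A, tau):
    """Proximal operator of tau * ||A||_1 (assumed sparse penalty).

    Shrinks every entry toward zero by tau and zeroes entries
    whose magnitude is at most tau.
    """
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)


def singular_value_threshold(A, tau):
    """Proximal operator of tau * ||A||_* (assumed low-rank penalty).

    Shrinks every singular value by tau and drops those that fall
    to zero, which lowers the rank of the result.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    # Scale the columns of U by the shrunken singular values, then recombine.
    return (U * s_shrunk) @ Vt


if __name__ == "__main__":
    # Hypothetical toy inputs, chosen only to make the effect visible.
    S = soft_threshold(np.array([[3.0, -0.5], [0.2, -2.0]]), 1.0)
    L = singular_value_threshold(np.diag([3.0, 0.5]), 1.0)
    print("sparse part after thresholding:\n", S)
    print("rank of low-rank part:", np.linalg.matrix_rank(L))
```

In a proximal gradient scheme, each iteration would take a gradient step on the smooth part of the objective and then apply operators like these to the sparse and low-rank components; the smooth loss itself is not given in the description and is left out here.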

Provider:
Taylor & Francis
Created:
2018-02-13