Regularized Cross-Sectional Network Modeling with Missing Data: A Comparison of Methods

Name: Regularized Cross-Sectional Network Modeling with Missing Data: A Comparison of Methods
Creator: Taylor & Francis
Published: 2025-09-17 08:00:26
License: No description available

Source: DataCite Commons. Updated: 2025-09-17. Indexed: 2026-02-09.

Download link:

https://tandf.figshare.com/articles/dataset/Regularized_Cross-Sectional_Network_Modeling_with_Missing_Data_A_Comparison_of_Methods/30145926/1


Description:

Many applications of network modeling involve cross-sectional data on psychological variables (e.g., symptoms of psychological disorders), and analyses are often conducted with a regularized Gaussian graphical model (GGM) that applies a lasso penalty, also known as the graphical lasso or glasso. Methodology for handling missing data with glasso remains underdeveloped, which precludes the use of planned missing data designs to reduce participant fatigue. In this research, we compare three approaches to handling missing data with glasso. The first resembles a two-stage estimation approach, borrowed from the covariance structure modeling literature, in which a saturated covariance matrix among the items is estimated before glasso is applied. The second and third approaches combine glasso with the expectation-maximization (EM) algorithm in a single stage and select the tuning parameter using either the extended Bayesian information criterion (EBIC) or cross-validation. We compared these approaches in a simulation study that varied sample size, proportion of missing data, and network saturation. An example with data from the Patient Reported Outcomes Measurement Information System is also provided. The EM algorithm with cross-validation performed best, but all methods appeared to be viable strategies with larger samples and less missing data.
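
For readers unfamiliar with EBIC-based tuning, the following is a minimal sketch of how an EBIC score could be computed for one candidate GGM, so that a grid of glasso penalties can be compared and the lowest score kept. It is not the authors' code. The function names `logdet_spd` and `ggm_ebic`, the default `gamma=0.5`, the zero tolerance `tol`, and the use of a single sample size `n` (which is ambiguous when data are missing) are all assumptions made for illustration.

```python
import math


def logdet_spd(K):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    p = len(K)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return 2.0 * sum(math.log(L[i][i]) for i in range(p))


def ggm_ebic(S, K, n, gamma=0.5, tol=1e-8):
    """EBIC of a GGM with precision matrix K given covariance matrix S.

    Assumed form: -2 * loglik + E * log(n) + 4 * gamma * E * log(p),
    where E counts nonzero off-diagonal pairs (edges) in K.
    """
    p = len(K)
    # tr(S K) computed elementwise
    trace_sk = sum(S[i][j] * K[j][i] for i in range(p) for j in range(p))
    # Gaussian log-likelihood with additive constants dropped
    loglik = 0.5 * n * (logdet_spd(K) - trace_sk)
    n_edges = sum(
        1 for i in range(p) for j in range(i + 1, p) if abs(K[i][j]) > tol
    )
    return -2.0 * loglik + n_edges * math.log(n) + 4.0 * gamma * n_edges * math.log(p)


# Hypothetical 3-variable example: compare an empty network with a one-edge network.
S = [[1.0, -0.4, 0.0], [-0.4, 1.0, 0.0], [0.0, 0.0, 1.0]]
K_empty = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
K_edge = [[1.2, 0.45, 0.0], [0.45, 1.2, 0.0], [0.0, 0.0, 1.0]]
scores = {"empty": ggm_ebic(S, K_empty, n=200), "one_edge": ggm_ebic(S, K_edge, n=200)}
best = min(scores, key=scores.get)
```

In a two-stage workflow of the kind described above, `S` would be the saturated covariance estimate from the first stage and each `K` a glasso solution at one penalty value. Setting `gamma=0` reduces the score to the ordinary BIC.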


Provider:

Taylor & Francis

Created:

2025-09-17
