
A Tale of Two Datasets: Representativeness and Generalisability of Inference for Samples of Networks

Source: DataCite Commons | Updated: 2023-10-26 | Indexed: 2024-08-18
Download link:
https://tandf.figshare.com/articles/dataset/A_Tale_of_Two_Datasets_Representativeness_and_Generalisability_of_Inference_for_Samples_of_Networks/23925915
Description:
The last two decades have seen considerable progress in foundational aspects of statistical network analysis, but the path from theory to application is not straightforward. Two large, heterogeneous samples of small networks of within-household contacts in Belgium were collected using two different but complementary sampling designs: one smaller but with all contacts in each household observed, the other larger and more representative but recording contacts of only one person per household. We wish to combine their strengths to learn the social forces that shape household contact formation and facilitate simulation for prediction of disease spread, while generalising to the population of households in the region. To accomplish this, we describe a flexible framework for specifying multi-network models in the exponential family class and identify the requirements for inference and prediction under this framework to be consistent, identifiable, and generalisable, even when data are incomplete; explore how these requirements may be violated in practice; and develop a suite of quantitative and graphical diagnostics for detecting violations and suggesting improvements to candidate models. We report on the effects of network size, geography, and household roles on household contact patterns (activity, heterogeneity in activity, and triadic closure). Supplementary materials for this article are available online.

Provider:
Taylor & Francis
Created:
2023-08-10