
Data from: Two-phase importance sampling for inference about transmission trees

Source: DataONE. Updated 2014-08-29; indexed 2024-06-27.
Download link:
https://search.dataone.org/view/null
Description:
There has been growing interest in the statistics community in developing methods for inferring transmission pathways of infectious pathogens from molecular sequence data. For many datasets, the computational challenge lies in the huge dimension of the missing data. Here, we introduce an importance sampling scheme in which the transmission trees and phylogenies of pathogens are both sampled from reasonable importance distributions, which eases the inference. With this approach, arbitrary models of transmission can be considered, in contrast to many earlier methods. We illustrate the scheme by analysing transmission of Streptococcus pneumoniae from household to household within a refugee camp, using data in which only a fraction of hosts is observed, but which are still rich enough to unravel the within-household transmission dynamics and to identify pairs of households between which transmission is plausible. We observe that, while the probability of direct transmission is low even for the most prominent cases of transmission, those pairs of households are geographically much closer to each other than expected under random proximity.
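
For readers unfamiliar with the general technique, the following is a minimal sketch of generic self-normalized importance sampling on a toy one-dimensional target. It is not the two-phase scheme of the dataset's authors, which samples transmission trees and phylogenies. The function names, the Gaussian target and proposal, and the sample size are all assumptions chosen for illustration.

```python
import math
import random


def self_normalized_is(log_target, sample_proposal, log_proposal, f, n, rng):
    """Estimate E_target[f(x)] by self-normalized importance sampling.

    log_target may be unnormalized: its normalizing constant cancels
    when the weights are divided by their sum.
    """
    xs = [sample_proposal(rng) for _ in range(n)]
    log_w = [log_target(x) - log_proposal(x) for x in xs]
    m = max(log_w)  # subtract the max before exponentiating, for stability
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    estimate = sum(wi * f(x) for wi, x in zip(w, xs)) / total
    # Effective sample size: a low value signals a poorly matched proposal.
    ess = total * total / sum(wi * wi for wi in w)
    return estimate, ess


# Hypothetical toy target: unnormalized N(1, 1).
def log_target(x):
    return -0.5 * (x - 1.0) ** 2


# Hypothetical toy proposal: N(0, 2^2), wider than the target.
def sample_proposal(rng):
    return rng.gauss(0.0, 2.0)


def log_proposal(x):
    return -0.5 * (x / 2.0) ** 2 - math.log(2.0 * math.sqrt(2.0 * math.pi))


rng = random.Random(0)
mean_est, ess = self_normalized_is(
    log_target, sample_proposal, log_proposal, lambda x: x, 20000, rng
)
print(f"estimated mean: {mean_est:.3f}, effective sample size: {ess:.0f}")
```

The proposal here is deliberately wider than the target so that the weights stay bounded. In a high-dimensional setting such as the one the abstract describes, the choice of importance distribution matters far more, and the effective sample size is one common diagnostic of how well it matches the target.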

Created:
2014-08-29