Training and test data with scripts for simulation-trained deep learning and likelihood-based phylogeography comparisons
Source: DataCite Commons. Updated: 2026-03-24. Indexed: 2026-04-25.
Download link:
https://datadryad.org/dataset/doi:10.25338/B8SH2J
Description:
Analysis of phylogenetic trees has become an essential tool in
epidemiology. Likelihood-based methods fit models to phylogenies to draw
inferences about the phylodynamics and history of viral transmission.
However, these methods are computationally expensive, which limits the
complexity and realism of phylodynamic models and makes them ill-suited
for informing policy decisions in real time during rapidly developing
outbreaks. Likelihood-free methods using deep learning are pushing the
boundaries of inference beyond these constraints. In this paper, we
extend, compare and contrast a recently developed deep learning method for
likelihood-free inference from trees. We trained multiple deep neural
networks using phylogenies from simulated outbreaks that spread among five
locations and found they achieve similar levels of accuracy to Bayesian
inference under the true simulation model. We compared robustness to model
misspecification of a trained neural network to that of a Bayesian method.
We found that both models had comparable performance, converging on
similar biases. We also trained and tested a neural network against
phylogeographic data from a recent study of the SARS-CoV-2 pandemic in
Europe and obtained similar estimates of epidemiological parameters and
the location of the common ancestor in Europe. Along with being as
accurate and robust as likelihood-based methods, our trained neural
networks are, on average, more than three orders of magnitude faster. Our results
support the notion that neural networks can be trained with simulated data
to accurately mimic the good and bad statistical properties of the
likelihood functions of generative phylogenetic models.
Provider:
Dryad
Created:
2023-10-04



