Data from: On the utility of deep learning for model classification and parameter estimation on complex diversification scenarios
Source: DataCite Commons. Updated 2026-05-12; indexed 2026-05-17
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.f7m0cfz6b
Description:
Birth-Death models applied to dated phylogenies are a useful tool to study
past diversification dynamics. Parameters in these stochastic models are
typically inferred using likelihood-based methods such as Maximum
Likelihood Estimation (MLE) or Bayesian Inference, though some of the most
complex models present computational tractability issues. Recent years
have witnessed the development of Deep Learning (DL) methods applied to
evolutionary biology and phylogenetic inference. Here, we explore the
power of Convolutional Neural Networks (CNNs), a type of DL method, to
solve classification and regression (parameter estimation) tasks under six
different rate-constant and rate-variable diversification scenarios:
Constant Birth-Death, High-Extinction, Mass-Extinction,
Diversity-Dependent, Stasis-and-Radiate, and Waxing-and-Waning. We
simulated 10,000 phylogenetic trees under each diversification scenario,
which were encoded using a vectorization procedure that captures the
topology and branch length information. The encoded trees were used to
train and test a set of CNN models tailored to three
empirical case studies that differ in the number of tips. We compared the
CNN's performance with MLE inference. Our results show that CNNs
exhibited classification accuracy levels of 80-90%, whereas maximum
likelihood estimation achieved levels of 60-69%, using AIC as the model
selection criterion. The most difficult scenarios to predict for the CNNs
were the high-extinction and mass-extinction scenarios, which were often
misidentified as one another. For the regression tasks, CNN models
obtained generally lower mean average errors than MLE inference,
irrespective of the number of tips in the simulated phylogenies, though
differences were small. The only exception was the discrete time event
parameter in the episodic diversification scenarios (Mass-Extinction,
Stasis-and-Radiate, and Waxing-and-Waning), in which MLE inference showed
a lower error than the CNNs. Finally, we illustrate and discuss the
application of our CNNs to real-world phylogenies, using three classic
empirical case studies: eucalypts, conifers, and cetaceans.
Provider:
Dryad
Created:
2026-05-12



