Data from: On the utility of deep learning for model classification and parameter estimation on complex diversification scenarios

Source: DataCite Commons. Updated: 2026-05-12. Indexed: 2026-05-17.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.f7m0cfz6b
Description:
Birth-Death models applied to dated phylogenies are a useful tool for studying past diversification dynamics. Parameters in these stochastic models are typically inferred using likelihood-based methods such as Maximum Likelihood Estimation (MLE) or Bayesian Inference, though some of the most complex models are computationally intractable. Recent years have seen the development of Deep Learning (DL) methods applied to evolutionary biology and phylogenetic inference. Here, we explore the power of Convolutional Neural Networks (CNNs), a type of DL method, to solve classification and regression (parameter estimation) tasks under six rate-constant and rate-variable diversification scenarios: Constant Birth-Death, High-Extinction, Mass-Extinction, Diversity-Dependent, Stasis-and-Radiate, and Waxing-and-Waning. We simulated 10,000 phylogenetic trees under each diversification scenario and encoded them using a vectorization procedure that captures topology and branch length information. The encoded trees were used to train and test a set of CNN models tailored to three empirical case studies that differ in the number of tips. We compared the CNNs' performance with MLE inference. Our results show that CNNs reached classification accuracy levels of 90-80%, whereas MLE reached levels of 69-60%, using AIC as the model selection criterion. The most difficult scenarios for the CNNs to predict were High-Extinction and Mass-Extinction, which were often misidentified as one another. For the regression tasks, CNN models generally obtained lower mean average errors than MLE inference, irrespective of the number of tips in the simulated phylogenies, though differences were small. The only exception was the discrete time event parameter in the episodic diversification scenarios (Mass-Extinction, Stasis-and-Radiate, and Waxing-and-Waning), for which MLE inference showed a lower error than the CNNs.
Finally, we illustrate and discuss the application of our CNNs to real-world phylogenies, using three classic empirical case studies: eucalypts, conifers, and cetaceans.
Provider:
Dryad
Created:
2026-05-12