Data from: Coalescent-based branch length estimation improves dating of species trees

Source: DataCite Commons. Updated: 2026-04-03. Indexed: 2026-04-25.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.hmgqnk9xv
Description:
Species trees need to be dated for many downstream applications. Typical molecular dating methods take a phylogenetic tree with branch lengths in substitution units, as well as a set of calibrations, as input and convert the branch lengths of the species tree to the unit of time, while being consistent with the pre-specified calibrations. When dating species trees from multi-locus genome-scale datasets, the branch lengths and sometimes the topology of the species tree are estimated using concatenation. However, concatenation does not address gene tree heterogeneity across the genome. While Bayesian dating methods can address some forms of gene tree heterogeneity, such as incomplete lineage sorting, they are not scalable to large datasets. In this paper, we introduce a new scalable pipeline for dating species trees that addresses gene tree discordance for both topology and branch length estimation. The pipeline uses discordance-aware methods that account for incomplete lineage sorting for estimating the topology and branch lengths, and maximum likelihood-based methods for the dating step. Our simulation study on datasets with gene tree discordance shows that this pipeline produces more accurate and less biased dates than pipelines that use concatenation. Furthermore, it is substantially more scalable and can handle datasets with thousands of species and genes. Our results on two biological datasets demonstrate that this new pipeline improves the inference of node ages and branch lengths for certain nodes, particularly those closer to the tree tips, and improves the downstream diversification analysis.
Provider:
Dryad
Created:
2026-04-03