Data from: Species tree branch length estimation despite incomplete lineage sorting, duplication, and loss
Source: DataCite Commons. Updated: 2026-01-29. Indexed: 2026-04-25.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.70rxwdcc8
Description:
Phylogenetic branch lengths are essential for many analyses, such as
estimating divergence times, analyzing rate changes, and studying
adaptation. However, true gene tree heterogeneity due to incomplete
lineage sorting, gene duplication and loss, and horizontal gene transfer
can complicate the estimation of species tree branch lengths. While
several tools exist for estimating the topology of a species tree
addressing various causes of gene tree discordance, much less attention
has been paid to branch length estimation on multi-locus datasets. For
single-copy gene trees, some methods are available that summarize gene
tree branch lengths onto a species tree, including coalescent-based
methods that account for heterogeneity due to incomplete lineage sorting.
However, no such branch length estimation method exists for multi-copy
gene family trees that have evolved with gene duplication and loss. To
address this gap, we introduce the CASTLES-Pro algorithm for estimating
species tree branch lengths while accounting for both gene duplication and
loss and incomplete lineage sorting. CASTLES-Pro improves on the existing
coalescent-based branch length estimation method CASTLES by increasing its
accuracy for single-copy gene trees and extending it to handle multi-copy
ones. Our simulation studies show that CASTLES-Pro is generally more
accurate than alternatives, eliminating the systematic bias toward
overestimating terminal branch lengths often observed when using
concatenation. Moreover, although CASTLES-Pro is not theoretically designed
for horizontal gene transfer, we show that it is relatively robust to random
horizontal gene transfer, though its accuracy can degrade at the highest
levels of horizontal gene transfer.
Provider:
Dryad
Created:
2025-12-16



