Data and Supplement from: Phylogenetic tree instability after taxon addition: Empirical frequency, predictability, and consequences for online inference
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2025-06-15.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.63xsj3v9x
Description:
Online phylogenetic inference methods add sequentially arriving sequences
to an inferred phylogeny without the need to recompute the entire tree
from scratch. Some online method implementations exist already, but there
remains concern that additional sequences may change the topological
relationship among the original set of taxa. We call such a change in tree
topology a lack of stability for the inferred tree. In this
paper, we analyze the stability of single taxon addition in a Maximum
Likelihood framework across 1,000 empirical datasets. We find that
instability occurs in almost 90% of our examples, although observed
topological differences do not always reach significance under the
AU-test. Changes in tree topology after addition of a taxon rarely occur
close to its attachment location, and are more frequently observed in more
distant tree locations carrying low bootstrap support. To investigate
whether instability is predictable, we hypothesize sources of instability
and design summary statistics addressing these hypotheses. Using these
summary statistics as input features for machine learning under random
forests, we are able to predict instability and can identify the most
influential features. In summary, it does not appear that a strict
insertion-only online inference method will deliver globally optimal
trees, although relaxing insertion strictness by allowing for a small
number of final tree rearrangements or accepting slightly suboptimal
solutions appears feasible.
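
The notion of stability defined above can be illustrated with a small sketch. It is not the authors' pipeline, which relies on Maximum Likelihood inference and the AU-test. It only shows the topological comparison: restrict the tree inferred after taxon addition to the original taxa, then compare its bipartitions with those of the original tree. The nested-tuple tree representation and the names `splits`, `rf_distance`, and `is_stable` are assumptions made for this example.

```python
def leaf_sets(tree, keep, out):
    """Collect, for every clade in `tree`, its leaves restricted to `keep`."""
    if isinstance(tree, str):
        leaves = frozenset([tree]) & keep
    else:
        leaves = frozenset().union(*(leaf_sets(child, keep, out) for child in tree))
    out.append(leaves)
    return leaves


def splits(tree, taxa):
    """Non-trivial bipartitions of an unrooted tree, restricted to `taxa`."""
    taxa = frozenset(taxa)
    clades = []
    leaf_sets(tree, taxa, clades)
    ref = min(taxa)  # arbitrary reference taxon used to normalise each split
    result = set()
    for clade in clades:
        # Skip trivial splits (single leaves, or everything except one leaf).
        if 2 <= len(clade) <= len(taxa) - 2:
            # Always store the side that does not contain the reference taxon.
            result.add(clade if ref not in clade else taxa - clade)
    return result


def rf_distance(tree_a, tree_b, taxa):
    """Robinson-Foulds distance on the shared taxon set `taxa`."""
    return len(splits(tree_a, taxa) ^ splits(tree_b, taxa))


def is_stable(original, extended, original_taxa):
    """True if adding taxa left the topology of the original taxa unchanged."""
    return rf_distance(original, extended, original_taxa) == 0


# Hypothetical toy trees (nested tuples); "X" is the newly added taxon.
taxa = {"A", "B", "C", "D", "E"}
original = (("A", "B"), ("C", ("D", "E")))
extended_stable = (("A", ("B", "X")), ("C", ("D", "E")))
extended_unstable = ((("A", "X"), "C"), ("B", ("D", "E")))

print(is_stable(original, extended_stable, taxa))      # True
print(is_stable(original, extended_unstable, taxa))    # False
print(rf_distance(original, extended_unstable, taxa))  # 2
```

In the second case the added taxon coincides with a regrouping of A, B, and C, so the split separating {A, B} from {C, D, E} is replaced by one separating {A, C} from {B, D, E}. That is an instability in the sense used above, whether or not the difference would be significant under a likelihood-based test.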
Provider:
Dryad
Created:
2024-11-01



