Data from: Predicting the ancestral character changes in a tree is typically easier than predicting the root state
Source: Mendeley Data. Updated 2024-06-25; indexed 2024-06-27.
Download link:
https://zenodo.org/records/5001206
Description:
Predicting the ancestral sequences of a group of homologous sequences related by a phylogenetic tree has been the subject of many studies, and numerous methods have been proposed for this purpose. Theoretical results show that when the substitution rate becomes too large, reconstructing the ancestral state at the tree root is no longer feasible. Here, we also study the reconstruction of the ancestral changes that occurred along the tree edges. We show that, depending on the tree and branch length distribution, reconstructing these changes (i.e. reconstructing the ancestral state of all internal nodes in the tree) may be easier or harder than reconstructing the ancestral root state. However, results from information theory indicate that for the standard Yule tree, the task of reconstructing internal node states remains feasible, even for very high substitution rates. Moreover, computer simulations demonstrate that this result still holds for more complex trees and scenarios. For a large variety of counting, parsimony-based and likelihood-based methods, the accuracy of predicting the state of a randomly selected internal node in the tree is indeed much higher than the accuracy of the same method when applied to the tree root. In addition, parsimony- and likelihood-based methods appear to be remarkably robust to sampling bias and model mis-specification.
Created:
2023-06-28



