
Data from: Practical performance of tree comparison metrics

Source: Mendeley Data. Updated 2024-06-25; indexed 2024-06-27.
Download link:
https://zenodo.org/records/5000188
Description:
The phylogenetic literature contains numerous measures for assessing differences between two phylogenetic trees. Individual measures have been criticized on various grounds, but little is known about their comparative performance in typical applications. We evaluate the performance of nine tree distance measures on two tasks: (1) distinguishing trees separated by lesser versus greater numbers of recombinations, and (2) distinguishing trees inferred with lower versus higher quality data. We find that when the trees being compared are similar, measures which make use of branch lengths are superior, with the branch-length version of the Robinson-Foulds metric (Robinson & Foulds, 1979) performing best. In contrast, for dissimilar trees topology-only measures are superior, with the Alignment metric of Nye et al. (2006) performing best. We also apply the measures to a mammalian data set and observe that the best metric depends on whether branch-length information is of interest. We give practical recommendations for choosing a tree distance metric in different applications.
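For readers unfamiliar with the Robinson-Foulds metric named in the description, the following is a minimal sketch of its topology-only form: the number of non-trivial leaf bipartitions (splits) present in one tree but not the other. It is an illustration only, not the code or data from this record. The nested-tuple tree encoding and the names `leaf_set`, `splits`, and `rf_distance` are assumptions made for this example, and the branch-length version discussed in the description is not implemented here.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical encoding for this sketch: a leaf is a string,
# an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaf_set(tree: Tree) -> FrozenSet[str]:
    """Return the set of leaf labels below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the non-trivial bipartitions induced by internal edges.

    Each split is stored as the side that excludes a fixed reference
    leaf, so the result does not depend on where the tree is rooted.
    """
    all_leaves = leaf_set(tree)
    ref = min(all_leaves)
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - clade if ref in clade else clade
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return clade

    visit(tree)
    return found


def rf_distance(a: Tree, b: Tree) -> int:
    """Topology-only Robinson-Foulds distance (size of the symmetric difference of splits)."""
    if leaf_set(a) != leaf_set(b):
        raise ValueError("trees must share the same leaf labels")
    return len(splits(a) ^ splits(b))


t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
print(rf_distance(t1, t2))
```

In this example the two trees share the split separating D and E from the rest but disagree on how A, B, and C are grouped, so each tree contributes one split the other lacks.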

Created:
2023-06-28