
Data from: Pruning rogue taxa improves phylogenetic accuracy: an efficient algorithm and webservice

DataONE, updated 2012-10-12, indexed 2024-06-27
Download link:
https://search.dataone.org/view/null
Description:
The presence of rogue taxa (rogues) in a set of trees can frequently have a negative impact on the results of a bootstrap analysis (e.g., the overall support in consensus trees). We introduce an efficient graph-based algorithm for rogue taxon identification as well as an interactive web-service implementing this algorithm. Compared to our previous method, the new algorithm is up to four orders of magnitude faster, while returning qualitatively identical results. Because of this significant improvement in scalability, the new algorithm can now identify substantially more complex and compute-intensive rogue taxon constellations. On a large and diverse collection of real-world datasets, we show that our method yields better supported reduced/pruned consensus trees than any competing rogue taxon identification method. Using the parallel version of our open-source code, we successfully identified rogue taxa in a set of 100 trees with 116,334 taxa each. Using simulated datasets, we show that pruning rogue taxa from a tree set with our method consistently yields bootstrap consensus trees as well as maximum likelihood trees that are topologically closer to the respective true trees.
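
To make the effect described above concrete, here is a minimal toy sketch in Python of how a single unstable taxon can depress clade support across a tree set, and how pruning it restores support. This is not the graph-based algorithm from the paper; the tree encoding (nested tuples), the helper names `leaves`, `clades`, and `clade_support`, and the five-taxon example are all assumptions made purely for illustration.

```python
from collections import Counter

def leaves(tree):
    """Return the set of taxon names below a node (nested tuples of str)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def clades(tree, pruned=frozenset()):
    """Collect the non-trivial clades of a rooted tree, ignoring pruned taxa."""
    all_taxa = leaves(tree) - pruned
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node) - pruned
        # Skip single leaves and the clade containing every remaining taxon.
        if 1 < len(clade) < len(all_taxa):
            found.add(clade)
        for child in node:
            visit(child)

    visit(tree)
    return found

def clade_support(trees, pruned=frozenset()):
    """Fraction of trees in which each clade occurs."""
    counts = Counter(c for t in trees for c in clades(t, pruned))
    return {c: n / len(trees) for c, n in counts.items()}

# Hypothetical tree set: taxon "R" jumps around while A, B, C, D stay stable.
trees = [
    ((("A", "B"), "R"), ("C", "D")),
    (("A", "B"), (("C", "D"), "R")),
    (("A", ("B", "R")), ("C", "D")),
    (("A", "B"), ("C", ("D", "R"))),
]

before = clade_support(trees)
after = clade_support(trees, pruned=frozenset(["R"]))

print(before[frozenset(["A", "B"])], after[frozenset(["A", "B"])])
```

In this toy set, the clade {A, B} appears in three of the four trees before pruning and in all four once "R" is removed. A real rogue taxon search must also decide which taxa to prune, which is the hard part the paper's algorithm addresses; the sketch only evaluates one candidate chosen by hand.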

Created:
2012-10-12