TreeShrink: fast and accurate detection of outlier long branches in collections of phylogenetic trees
Source: DataONE. Updated: 2023-06-26. Indexed: 2025-08-09.
Download link:
https://search.dataone.org/view/sha256:a33276d30eb7b192d9147befc3c65689e5b5ec66f30516e9ff0148b1c9908943
Description:
Phylogenetic trees include errors for a variety of reasons. We argue that one way to detect errors is to build a phylogeny with all the data and then detect taxa that artificially inflate the tree diameter. We formulate an optimization problem that seeks to find k leaves that can be removed to reduce the tree diameter maximally. We present a polynomial-time solution to this "k-shrink" problem. Given this solution, we then use non-parametric statistics to find an outlier set of taxa that have an unexpectedly high impact on the tree diameter. We test our method, TreeShrink, on five biological datasets, and show that it is more conservative than rogue taxon removal using RogueNaRok. When the amount of filtering is controlled, TreeShrink outperforms RogueNaRok in three out of the five datasets, and they tie in another dataset.

All the raw data are obtained from other publications as shown below. We further analyzed the data and provide the results of the analyses here. The methods used to analyze the data are described in the paper.
Dataset    Species    Genes    Download
Plants     104        852      DOI 10.1186/2047-217X-3-17
Mammals    37         424      DOI 10.13012/C5BG2KWG
Insects    144        1478     http://esayyari.github.io/InsectsData
Cannon     78         213      DOI 10.5061/dryad.493b7
Rouse      26         393      DOI 10.5061/dryad.79dq1
Frogs      164        95       DOI 10.5061/dryad.12546.2
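
The k-shrink problem stated in the description can be illustrated on a toy tree. The sketch below is an exhaustive search over all sets of k leaves. It is not the polynomial-time algorithm the authors present, and it does not reflect the interface of the TreeShrink software. The function names, the edge-list representation, and the example tree are all hypothetical and chosen only for illustration. It relies on the fact that removing leaves from a tree leaves the path lengths between the remaining leaves unchanged, so the diameter of the reduced tree is the largest distance among the leaves that are kept.

```python
from itertools import combinations


def leaf_distances(edges):
    """Return (leaves, dist) for a tree given as (u, v, branch_length) edges.

    dist maps ordered leaf pairs (a, b) to their path length.
    """
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    # Leaves are the nodes with exactly one neighbour.
    leaves = sorted(n for n, nbrs in adj.items() if len(nbrs) == 1)
    dist = {}
    for src in leaves:
        # Iterative traversal; in a tree every node is reached by one path.
        stack = [(src, None, 0.0)]
        while stack:
            node, parent, d = stack.pop()
            if node != src and len(adj[node]) == 1:
                dist[(src, node)] = d
            for nxt, w in adj[node]:
                if nxt != parent:
                    stack.append((nxt, node, d + w))
    return leaves, dist


def diameter(leaves, dist):
    """Largest leaf-to-leaf distance among the given leaves."""
    return max((dist[(a, b)] for a, b in combinations(leaves, 2)), default=0.0)


def k_shrink_brute_force(edges, k):
    """Try every set of k leaves and keep the one minimising the diameter.

    Exponential in k; for illustration on small trees only.
    """
    leaves, dist = leaf_distances(edges)
    best_removed, best_diam = None, float("inf")
    for removed in combinations(leaves, k):
        kept = [leaf for leaf in leaves if leaf not in removed]
        d = diameter(kept, dist)
        if d < best_diam:
            best_removed, best_diam = removed, d
    return best_removed, best_diam


# Hypothetical toy tree: leaf D sits on a branch ten times longer than the rest.
toy_edges = [
    ("A", "X", 1.0),
    ("B", "X", 1.0),
    ("X", "Y", 1.0),
    ("C", "Y", 1.0),
    ("D", "Y", 10.0),
]
print(k_shrink_brute_force(toy_edges, 1))  # (('D',), 3.0)
```

In this toy tree the full diameter is 12.0 (from A or B to D). Removing D alone brings it down to 3.0, while removing any other single leaf leaves it at 11.0 or more, which is the kind of disproportionate impact on the diameter that the description refers to.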
Created: 2025-07-22



