
DupLoss-2: Improved phylogenomic species tree inference under gene duplication and loss

DataONE · Updated 2025-10-24 · Indexed 2025-11-01
Download link:
https://search.dataone.org/view/sha256:a7210336364a03aa9b9d0636642d7f2210a0ca385ca37209d1b1ed6b7657638a
Resource description:
Accurate species tree reconstruction in the presence of widespread gene duplication and loss is a challenging problem in eukaryote phylogenomics. Many phylogenomics methods have been developed over the years to address this challenge; these range from older methods based on gene tree parsimony to newer quartet-based methods. In this work, we introduce improved software for gene tree parsimony-based species tree reconstruction under gene duplication and loss. The new software, DupLoss-2, uses an improved procedure for computing gene losses and is far more accurate and much easier to use than its previous version, released over a decade ago. We thoroughly evaluate DupLoss-2 and eight other existing methods, including ASTRAL-Pro, ASTRAL-Pro 2, DISCO-ASTRAL, DISCO-ASTRID, FastMulRFS, and SpeciesRax, using existing benchmarking data and find that DupLoss-2 outperforms all other methods on most of the datasets. It delivers an average of almost 30% reduction in reconstruction error compared to...

# DupLoss-2: Improved phylogenomic species tree inference under gene duplication and loss

[https://doi.org/10.5061/dryad.0cfxpnwb9](https://doi.org/10.5061/dryad.0cfxpnwb9)

## Description of the data and file structure

All datasets are simulated and were generated to evaluate methods for phylogenomic species tree inference under gene duplication and loss.

### Files and variables

#### File: HighErrorData.zip

**Description:** HighErrorData.zip includes estimated gene trees with a high rate of reconstruction error. The estimated gene trees correspond to simulated species trees with 100 taxa and true gene trees simulated under three different gene duplication and loss rates. The zip file consists of three folders corresponding to the three duplication-loss rates. Each folder contains 20 replicate datasets, each consisting of 1000 estimated gene trees.
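
The layout described for HighErrorData.zip (three rate folders, 20 replicates per folder, 1000 estimated gene trees per replicate, so 60,000 gene trees in total) can be sanity-checked after download by counting archive entries per folder. The following is a minimal Python sketch using only the standard library; it is not part of the dataset. The helper name `count_files_by_folder`, the folder names, and the file names in the demo archive are hypothetical, since the actual names inside the zip file are not given here, and the demo uses a tiny in-memory archive rather than the real download.

```python
import io
import zipfile
from collections import Counter


def count_files_by_folder(archive, depth=1):
    """Count the non-directory entries in a zip archive, grouped by the
    first `depth` components of each entry's path."""
    counts = Counter()
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            parts = info.filename.split("/")
            # Files that sit above the requested depth are grouped under ".".
            key = "/".join(parts[:depth]) if len(parts) > depth else "."
            counts[key] += 1
    return counts


def build_demo_archive():
    """Build a tiny in-memory archive with a made-up layout for the demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        # Hypothetical folder names; the real archive may use different ones.
        for rate in ("rate_a", "rate_b", "rate_c"):
            for rep in range(1, 3):  # 2 replicates here, not the real 20
                zf.writestr(f"{rate}/rep{rep}/genetrees.txt", "((A,B),C);\n")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    counts = count_files_by_folder(build_demo_archive())
    for folder, n in sorted(counts.items()):
        print(folder, n)
```

To inspect the real archive, pass the path of the downloaded file instead of the demo buffer, and raise `depth` if the zip file wraps its contents in an extra top-level folder. Whether each replicate stores its gene trees in one file or in many is also not stated here, so the per-folder counts should be interpreted accordingly.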
#### File: ElevatedLossRatesData.zip

**Description:** ElevatedLossRatesData.zip contains the input data for an analysis in which a su...
Created:
2025-10-25