Data from: Phylogenomic subsampling and the search for phylogenetically reliable loci

Name: Data from: Phylogenomic subsampling and the search for phylogenetically reliable loci
Creator: Dryad
Published: 2026-03-05 22:02:19
License: No description available

DataCite Commons | Updated 2026-03-05 | Indexed 2025-04-09

Download link:

https://datadryad.org/dataset/doi:10.5061/dryad.sj3tx9646

Description:

Phylogenomic subsampling is a procedure by which small sets of loci are selected from large genome-scale datasets and used for phylogenetic inference. This step is often motivated either by computational limitations associated with the use of complex inference methods, or by the need to test the robustness of phylogenetic results by discarding loci that are deemed potentially misleading. Although many alternative methods of phylogenomic subsampling have been proposed, little effort has gone into comparing their behavior across different datasets. Here, I calculate multiple gene properties for a range of phylogenomic datasets spanning animal, fungal and plant clades, uncovering a remarkable predictability in their patterns of covariance. I also show how these patterns provide a means for ordering loci by both their rate of evolution and their relative phylogenetic usefulness. This method of retrieving phylogenetically useful loci is found to be among the top-performing when compared to alternative subsampling protocols. Relatively common approaches such as minimizing potential sources of systematic bias or increasing the clock-likeness of the data are found to fare worse than selecting loci at random. Likewise, the general utility of rate-based subsampling is found to be limited: loci evolving at both low and high rates are among the least effective, and even those evolving at optimal rates can still differ widely in usefulness. This study shows that many common subsampling approaches introduce unintended effects in off-target gene properties, and proposes an alternative multivariate method that simultaneously optimizes phylogenetic signal while controlling for known sources of bias.

Provided by:

Dryad

Created:

2021-06-09
