nRCFV: A sequence, taxon and character state-normalised metric for the pre-reconstruction evaluation of compositional heterogeneity
Source: DataONE. Updated: 2023-02-02. Indexed: 2025-07-19.
Download link:
https://search.dataone.org/view/sha256:3768d1eb239bc707f3240d6ad9132796bbcbc56548b0c943e63ff3801c3f2de5
Description:
Motivation
Compositional heterogeneity, when the proportions of nucleotides and amino acids are not broadly similar across the dataset, is a cause of a great number of phylogenetic artefacts. Whilst a variety of methods can identify it post-hoc, few metrics exist to quantify compositional heterogeneity prior to the computationally intensive task of phylogenetic tree reconstruction. Here we assess the efficacy of one such existing, widely used metric, Relative Composition Frequency Variability (RCFV), using both real and simulated data.
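For orientation, the following is a minimal sketch of how an RCFV-style value could be computed from an alignment. It assumes the commonly cited formulation, in which the absolute deviation of each taxon's character frequency from the across-taxon mean is summed over all characters and taxa and divided by the number of taxa. The function name `rcfv`, the dictionary input format, and the default nucleotide alphabet are illustrative assumptions and are not taken from this record; the nRCFV normalisation is not specified here and is therefore not shown.

```python
from collections import Counter


def rcfv(alignment, alphabet="ACGT"):
    """Sketch of an RCFV-style score (assumed formulation, see lead-in).

    alignment: dict mapping taxon name -> aligned sequence string.
    Characters outside `alphabet` (gaps, ambiguity codes) are ignored.
    """
    n_taxa = len(alignment)
    if n_taxa == 0:
        raise ValueError("alignment is empty")

    # Per-taxon relative frequency of each character state.
    freqs = {}
    for taxon, seq in alignment.items():
        counts = Counter(c for c in seq.upper() if c in alphabet)
        total = sum(counts.values())
        if total == 0:
            raise ValueError(f"no usable characters for taxon {taxon!r}")
        freqs[taxon] = {c: counts[c] / total for c in alphabet}

    # Mean frequency of each character state across taxa.
    mean = {c: sum(f[c] for f in freqs.values()) / n_taxa for c in alphabet}

    # Sum of absolute deviations from the mean, divided by the number of taxa.
    deviation = sum(abs(f[c] - mean[c]) for f in freqs.values() for c in alphabet)
    return deviation / n_taxa
```

Under this assumed formulation, an alignment in which every taxon has the same composition scores 0, and larger values indicate stronger compositional heterogeneity.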
Results
Our results show that RCFV can be biased by sequence length, the number of taxa, and the number of possible character states within the dataset. However, we also find that missing data does not appear to have an appreciable effect on RCFV. We discuss the theory behind this and its consequences for the future use of the RCFV value, and propose a new metric, nRCFV, which accounts for these biases. Alongside this, we present a new s...
Created:
2025-07-17



