Compositionally constrained sites drive long branch attraction
Source: DataCite Commons. Updated 2026-03-12; indexed 2026-04-25.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.g79cnp5rh
Description:
Accurate phylogenies are fundamental to our understanding of the pattern
and process of evolution. Yet, phylogenies at deep evolutionary
timescales, with correspondingly long branches, have been fraught with
controversy resulting from conflicting estimates from models with varying
complexity and goodness of fit. Analyses of historical as well as current
empirical datasets, such as alignments including Microsporidia, Nematoda
or Platyhelminthes, have demonstrated that inadequate modeling of
across-site compositional heterogeneity, which is the result of
biochemical constraints that lead to varying patterns of accepted amino
acids along sequences, can lead to erroneous topologies that are strongly
supported. Unfortunately, models that adequately account for across-site
compositional heterogeneity remain computationally challenging or
intractable for an increasing fraction of contemporary datasets. Here, we
introduce "compositional constraint analysis", a method to
investigate the effect of site-specific amino acid diversity on
phylogenetic inference. We show that more constrained sites with lower
diversity and less constrained sites with higher diversity exhibit
ostensibly conflicting signal under models that ignore across-site
compositional heterogeneity, and thus contribute to topological bias and
long branch attraction artifacts. We demonstrate that more
complex models accounting for across-site compositional heterogeneity can
ameliorate this bias. We present CAT-PMSF, a pipeline for diagnosing and
resolving phylogenetic bias resulting from inadequate modeling of
across-site compositional heterogeneity based on the CAT model. CAT-PMSF
is robust against long branch attraction in all alignments we have
examined. We suggest using CAT-PMSF when convergence of the CAT model
cannot be assured. We find evidence that compositionally constrained sites
are driving long branch attraction in two metazoan datasets and recover
evidence for Porifera as the sister group to all other animals.
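
The idea of ranking alignment sites by amino acid diversity can be illustrated with a minimal sketch. The description above does not state which diversity measure the authors use, so this example assumes the effective number of amino acids (the exponential of the Shannon entropy of a column). The function names `site_diversity` and `split_by_constraint` and the toy alignment are hypothetical and are not taken from the dataset or from CAT-PMSF.

```python
import math
from collections import Counter

# The 20 standard amino acids; gaps and ambiguity codes are ignored.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def site_diversity(column):
    """Effective number of amino acids at one site: exp(Shannon entropy).

    This measure is an assumption made for illustration only.
    """
    counts = Counter(c for c in column.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    return math.exp(entropy)


def split_by_constraint(alignment, n_bins=2):
    """Group site indices into bins, from most to least constrained.

    `alignment` maps taxon names to equal-length aligned sequences.
    """
    n_sites = len(next(iter(alignment.values())))
    columns = ["".join(seq[i] for seq in alignment.values()) for i in range(n_sites)]
    # Low diversity first, i.e. the most constrained sites lead the ranking.
    ranked = sorted(range(n_sites), key=lambda i: site_diversity(columns[i]))
    size = math.ceil(n_sites / n_bins)
    return [sorted(ranked[k:k + size]) for k in range(0, n_sites, size)]


# Hypothetical toy alignment, not data from the Dryad record.
toy = {
    "taxonA": "MKLAV",
    "taxonB": "MKIGV",
    "taxonC": "MRLSV",
    "taxonD": "MKVTV",
}
bins = split_by_constraint(toy, n_bins=2)
print(bins)
```

Each bin of sites could then be extracted as a sub-alignment and analyzed separately, so that topologies inferred from more and less constrained sites can be compared. That inference step is not shown here.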
Provider:
Dryad
Created:
2023-03-17



