
The perfect storm: Gene tree estimation error, incomplete lineage sorting, and ancient gene flow explain the most recalcitrant ancient angiosperm clade, Malpighiales

NIAID Data Ecosystem, indexed 2026-03-12
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.hx3ffbgck
Resource description:
The genomic revolution offers renewed hope of resolving rapid radiations in the Tree of Life. The development of the multispecies coalescent (MSC) model and improved gene tree estimation methods can better accommodate gene tree heterogeneity caused by incomplete lineage sorting (ILS) and gene tree estimation error stemming from the short internal branches. However, the relative influence of these factors in species tree inference is not well understood. Using anchored hybrid enrichment, we generated a data set including 423 single-copy loci from 64 taxa representing 39 families to infer the species tree of the flowering plant order Malpighiales. This order includes nine of the top ten most unstable nodes in angiosperms, which have been hypothesized to arise from the rapid radiation during the Cretaceous. Here, we show that coalescent-based methods do not resolve the backbone of Malpighiales and concatenation methods yield inconsistent estimations, providing evidence that gene tree heterogeneity is high in this clade. Despite high levels of ILS and gene tree estimation error, our simulations demonstrate that these two factors alone are insufficient to explain the lack of resolution in this order. To explore this further, we examined triplet frequencies among empirical gene trees and discovered some of them deviated significantly from those attributed to ILS and estimation error, suggesting gene flow as an additional and previously unappreciated phenomenon promoting gene tree variation in Malpighiales. Finally, we applied a novel method to quantify the relative contribution of these three primary sources of gene tree heterogeneity and demonstrated that ILS, gene tree estimation error, and gene flow contributed to 15%, 52%, and 32% of the variation, respectively.
Together, our results suggest that a perfect storm of factors likely influences this lack of resolution, and further indicate that recalcitrant phylogenetic relationships like the backbone of Malpighiales may be better represented as phylogenetic networks. Thus, reducing such groups solely to existing models that adhere strictly to bifurcating trees greatly oversimplifies reality, and obscures our ability to more clearly discern the process of evolution.

Methods: Supplementary materials for Cai et al., including supplementary notes, figures, and tables. Sequence alignments, gene trees, and species trees are also included.
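
The triplet-frequency reasoning in the description can be illustrated with a small sketch. Under ILS alone, the two minor (discordant) topologies of a rooted triplet are expected to appear in roughly equal numbers of gene trees, so a strong imbalance between them is one possible signal of gene flow. The code below is a minimal illustration of that idea using an exact two-sided binomial test; it is not the procedure used by Cai et al., which also accounts for gene tree estimation error. The function name `minor_triplet_asymmetry`, the topology labels, and all counts are hypothetical and made up for illustration.

```python
from math import comb


def minor_triplet_asymmetry(counts):
    """Test whether the two minor triplet topologies are equally frequent.

    counts: dict mapping each of the three rooted topologies of one triplet
    to the number of gene trees supporting it (hypothetical input format).
    Returns (major_topology, n_minor1, n_minor2, p_value).
    """
    if len(counts) != 3:
        raise ValueError("expected counts for exactly three topologies")

    # The most frequent topology is treated as the major one.
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    (major, _), (_, n1), (_, n2) = ordered

    # Exact two-sided binomial test with success probability 0.5:
    # under ILS alone, each discordant tree is equally likely to show
    # either minor topology.
    n = n1 + n2
    k = max(n1, n2)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    p_value = min(1.0, 2 * upper_tail)
    return major, n1, n2, p_value


# Made-up counts: balanced minors (compatible with ILS alone) ...
balanced = {"((A,B),C)": 200, "((A,C),B)": 113, "((B,C),A)": 110}
# ... and strongly unbalanced minors (a pattern gene flow could produce).
skewed = {"((A,B),C)": 200, "((A,C),B)": 150, "((B,C),A)": 73}

for label, data in (("balanced", balanced), ("skewed", skewed)):
    major, n1, n2, p = minor_triplet_asymmetry(data)
    print(f"{label}: major={major}, minors={n1}/{n2}, p={p:.3g}")
```

In practice such a test would be repeated over many triplets with a multiple-testing correction, and an imbalance can also arise from systematic estimation error, so a small p-value alone is not evidence of gene flow.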

Created:
2020-10-29