Data from: The effect of gene flow on coalescent-based species-tree inference
Source: Mendeley Data. Updated 2024-06-25; indexed 2024-06-27.
Download link:
https://zenodo.org/records/4984251
Description:
Most current methods for inferring species-level phylogenies under the coalescent model assume that no gene flow occurs following speciation. Several studies have examined the impact of gene flow (e.g., Eckert and Carstens (2008); Chung and Ane (2011); Leache et al. (2014); Solis-Lemus et al. (2016)) and of ancestral population structure (DeGiorgio and Rosenberg, 2016) on the performance of species-level phylogenetic inference, and analytic results have been proven for network models of gene flow (e.g., Solis-Lemus et al. (2016); Zhu et al. (2016)). However, there are few analytic results for a continuous model of gene flow following speciation, despite the development of mathematical tools that could facilitate such study (e.g., Hobolth et al. (2011); Andersen et al. (2014); Tian and Kubatko (2016)). In this paper, we consider a three-taxon isolation-with-migration model that allows gene flow between sister taxa for a brief period following speciation, as well as variation in the effective population sizes across the species tree. We derive the probabilities of each of the three gene tree topologies under this model, and show that for certain choices of the gene flow and effective population size parameters, anomalous gene trees (i.e., gene trees that are discordant with the species tree but that have higher probability than the gene tree concordant with the species tree) exist. We characterize the region of parameter space producing anomalous trees, and show that the probability of the gene tree that is concordant with the species tree can be arbitrarily small. We then show that there is theoretical support for using SVDQuartets with an outgroup to infer the rooted three-taxon species tree in a model of gene flow between sister taxa. We study the performance of SVDQuartets on simulated data and compare it to three other commonly used methods for species tree inference: ASTRAL, MP-EST, and concatenation.
The simulations show that ASTRAL, MP-EST, and concatenation can be statistically inconsistent when gene flow is present, while SVDQuartets performs well, though large sample sizes may be required for certain parameter choices.
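As a point of reference for the definition of anomalous gene trees in the description, the sketch below computes the three gene tree topology probabilities for the standard three-taxon multispecies coalescent with no gene flow and a single internal branch measured in coalescent units. This is the textbook baseline, not the isolation-with-migration model derived in the paper, and the function names are illustrative assumptions rather than anything shipped with the dataset.

```python
import math


def three_taxon_topology_probs(t_internal: float) -> dict:
    """Gene tree topology probabilities for species tree ((A,B),C).

    Assumes the standard multispecies coalescent with NO gene flow;
    t_internal is the internal branch length in coalescent units.
    """
    if t_internal < 0:
        raise ValueError("t_internal must be non-negative")
    # Probability that the A and B lineages fail to coalesce on the
    # internal branch, split evenly over the three possible topologies.
    p_discordant = math.exp(-t_internal) / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * p_discordant,  # concordant topology
        "((A,C),B)": p_discordant,
        "((B,C),A)": p_discordant,
    }


def is_anomalous(probs: dict, species_tree: str = "((A,B),C)") -> bool:
    """True if some discordant topology is more probable than the concordant one."""
    concordant = probs[species_tree]
    return any(p > concordant for topo, p in probs.items() if topo != species_tree)


if __name__ == "__main__":
    probs = three_taxon_topology_probs(1.0)
    print(probs, is_anomalous(probs))
```

In this baseline the concordant topology is never less probable than a discordant one, so `is_anomalous` returns `False` for any positive branch length; the description's claim is that gene flow between sister taxa, combined with variation in effective population sizes, can break that guarantee.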
Created:
2023-06-28



