Data from: Coestimating reticulate phylogenies and gene trees from multilocus sequence data
DataONE · Updated 2017-10-26 · Indexed 2024-06-26
Download link:
https://search.dataone.org/view/null
Description:
The multispecies network coalescent (MSNC) is a stochastic process that captures how gene trees grow within the branches of a phylogenetic network. Coupling the MSNC with a stochastic mutational process that operates along the branches of the gene trees gives rise to a generative model of how multiple loci from within and across species evolve in the presence of both incomplete lineage sorting (ILS) and reticulation (e.g., hybridization). We report on a Bayesian method for sampling the parameters of this generative model, including the species phylogeny, gene trees, divergence times, and population sizes, from DNA sequences of multiple independent loci. We demonstrate the utility of our method by analyzing simulated data and reanalyzing an empirical data set. Our results demonstrate the significance of not only co-estimating species phylogenies and gene trees, but also accounting for reticulation and ILS simultaneously. In particular, we show that when gene flow occurs, our method accurately estimates the evolutionary histories, coalescence times, and divergence times. Tree inference methods, on the other hand, underestimate divergence times and overestimate coalescence times when the evolutionary history is reticulate. While the MSNC corresponds to an abstract model of "intermixture," we study the performance of the model and method on simulated data generated under a gene flow model. We show that the method accurately infers the most recent time at which gene flow occurs. Finally, we demonstrate the application of the new method to a 106-locus yeast data set.
Created:
2017-10-26



