
Data from: Rumbling orchids: how to assess divergent evolution between chloroplast endosymbionts and the nuclear host

Source: DataONE. Updated: 2015-10-01. Indexed: 2024-06-27.
Download link:
https://search.dataone.org/view/null
Resource description:
Phylogenetic relationships inferred from multilocus organellar and nuclear DNA data are often difficult to resolve because of evolutionary conflicts among gene trees. However, conflicting or “outlier” associations (i.e., linked pairs of “operational terminal units” in two phylogenies) among these data sets often provide valuable information on evolutionary processes such as chloroplast capture following hybridization, incomplete lineage sorting, and horizontal gene transfer. Statistical tools that have so far been used only in cophylogenetic studies also have the potential to test the degree of topological congruence between organellar and nuclear data sets and to reliably detect outlier associations. Two distance-based methods, ParaFit and the Procrustean Approach to Cophylogeny (PACo), were used together to detect the outliers that contribute to conflict between phylogenies independently derived from chloroplast and nuclear sequence data. Using several simulation approaches, we explored how efficiently the two methods retrieve outlier associations and how the type of input data (unit branch length trees versus additive trees) affects the results. To test their performance on real data sets, we additionally inferred the phylogenetic relationships within Neotropical Catasetinae (Epidendroideae, Orchidaceae), a suitable group for investigating phylogenetic incongruence because of hybridization processes between some of its constituent species. A comparison between trees derived from chloroplast and nuclear sequence data revealed strong, well-supported incongruence within Catasetum, Cycnoches, and Mormodes. Outliers among chloroplast and nuclear data sets, and in experimental simulations, were successfully detected by PACo when using patristic distance matrices obtained from phylograms, but not from unit branch length trees. ParaFit performed worse than PACo overall, whether phylograms or unit branch length trees were used as input.
Because workflows for applying cophylogenetic analyses are not standardized yet, we provide a pipeline for executing PACo and ParaFit as well as displaying outlier associations in plots and trees by using the software R. The pipeline renders a method to identify outliers with high reliability and to assess the combinability of the independently derived data sets by means of statistical analyses.
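
The description contrasts patristic distance matrices taken from phylograms with those taken from unit branch length trees. The following is a minimal Python sketch of that distinction only. It is not the authors' R pipeline, and the toy tree, its branch lengths, and all function names are hypothetical illustrations. It assumes a rooted tree stored as a child-to-parent map with non-negative branch lengths.

```python
from itertools import combinations

# Hypothetical toy tree: child -> (parent, branch_length). "root" has no entry.
TOY_TREE = {
    "n1": ("root", 0.10),
    "A": ("n1", 0.02),
    "B": ("n1", 0.03),
    "C": ("root", 0.40),
}


def path_to_root(tree, node):
    """Map each ancestor of `node` (and `node` itself) to its distance from `node`."""
    dist = {node: 0.0}
    total = 0.0
    while node in tree:
        parent, length = tree[node]
        total += length
        dist[parent] = total
        node = parent
    return dist


def patristic(tree, a, b):
    """Sum of branch lengths on the path between tips `a` and `b`."""
    up_a = path_to_root(tree, a)
    up_b = path_to_root(tree, b)
    # With non-negative branch lengths, the most recent common ancestor
    # is the shared ancestor that minimises the summed distance.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


def patristic_matrix(tree, tips):
    """Pairwise patristic distances keyed by (tip, tip) tuples."""
    return {(a, b): patristic(tree, a, b) for a, b in combinations(tips, 2)}


def unit_branch_lengths(tree):
    """Copy of the tree with every branch length set to 1."""
    return {child: (parent, 1.0) for child, (parent, _) in tree.items()}


if __name__ == "__main__":
    tips = ["A", "B", "C"]
    print("phylogram:", patristic_matrix(TOY_TREE, tips))
    print("unit branch lengths:", patristic_matrix(unit_branch_lengths(TOY_TREE), tips))
```

In the unit branch length version every edge counts as 1, so the distances record only how many edges separate two tips and lose the information about how much change occurred along each branch.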

Created:
2015-10-01