Data from: Resolving rapid radiations within angiosperm families using anchored phylogenomics
Source: DataONE. Updated: 2017-05-01. Indexed: 2024-06-26.
Download link:
https://search.dataone.org/view/null
Resource description:
Despite the promise that molecular data would provide a seemingly unlimited source of independent characters, many plant phylogenetic studies are still based on only two regions, the plastid genome and nuclear ribosomal DNA (nrDNA). Their popularity can be explained by high copy numbers and universal PCR primers that make their sequences easily amplified and converted into parallel datasets. Unfortunately, their utility is limited by linked loci and limited characters resulting in low confidence in the accuracy of phylogenetic estimates, especially when rapid radiations occur. In another contribution on anchored phylogenomics in angiosperms, we presented flowering plant-specific anchored enrichment probes for hundreds of conserved nuclear genes and demonstrated their use at the level of all angiosperms. In this contribution, we focus on a common problem in phylogenetic reconstructions below the family level: weak or unresolved backbone due to rapid radiations (≤10 million years) followed by long divergence, using the Cariceae-Dulichieae-Scirpeae clade (CDS, Cyperaceae) as a test case. By comparing our nuclear matrix of 461 genes to a typical Sanger-sequence dataset consisting of a few plastid genes (matK, ndhF) and an nrDNA marker (ETS), we demonstrate that our nuclear data is fully compatible with the Sanger dataset and resolves short backbone internodes with high support in both concatenated and coalescence-based analyses. In addition, we show that nuclear gene tree incongruence is inversely proportional to phylogenetic information content, indicating that incongruence is mostly due to gene tree estimation error. This suggests that large numbers of conserved nuclear loci could produce more accurate trees than sampling rapidly evolving regions prone to saturation and long-branch attraction. 
The robust phylogenetic estimates obtained here, and high congruence with previous morphological and molecular analyses, are strong evidence for a complete tribal revision of CDS. The anchored hybrid enrichment probes used in this study should be similarly effective in other flowering plant groups.
Created:
2017-05-01



