
Data from: Resolving rapid radiations within angiosperm families using anchored phylogenomics

DataONE | Updated 2017-05-01 | Indexed 2024-06-26
Download link:
https://search.dataone.org/view/null
Description:
Despite the promise that molecular data would provide a seemingly unlimited source of independent characters, many plant phylogenetic studies are still based on only two regions, the plastid genome and nuclear ribosomal DNA (nrDNA). Their popularity can be explained by high copy numbers and universal PCR primers that make their sequences easily amplified and converted into parallel datasets. Unfortunately, their utility is limited by linked loci and limited characters resulting in low confidence in the accuracy of phylogenetic estimates, especially when rapid radiations occur. In another contribution on anchored phylogenomics in angiosperms, we presented flowering plant-specific anchored enrichment probes for hundreds of conserved nuclear genes and demonstrated their use at the level of all angiosperms. In this contribution, we focus on a common problem in phylogenetic reconstructions below the family level: weak or unresolved backbone due to rapid radiations (≤10 million years) followed by long divergence, using the Cariceae-Dulichieae-Scirpeae clade (CDS, Cyperaceae) as a test case. By comparing our nuclear matrix of 461 genes to a typical Sanger-sequence dataset consisting of a few plastid genes (matK, ndhF) and an nrDNA marker (ETS), we demonstrate that our nuclear data is fully compatible with the Sanger dataset and resolves short backbone internodes with high support in both concatenated and coalescence-based analyses. In addition, we show that nuclear gene tree incongruence is inversely proportional to phylogenetic information content, indicating that incongruence is mostly due to gene tree estimation error. This suggests that large numbers of conserved nuclear loci could produce more accurate trees than sampling rapidly evolving regions prone to saturation and long-branch attraction. 
The robust phylogenetic estimates obtained here, and high congruence with previous morphological and molecular analyses, are strong evidence for a complete tribal revision of CDS. The anchored hybrid enrichment probes used in this study should be similarly effective in other flowering plant groups.

Created:
2017-05-01