Data from: Resolving rapid radiations within angiosperm families using anchored phylogenomics

Source: DataONE · Updated: 2017-05-01 · Indexed: 2024-06-26
Download link:
https://search.dataone.org/view/null
Description:
Despite the promise that molecular data would provide a seemingly unlimited source of independent characters, many plant phylogenetic studies are still based on only two regions, the plastid genome and nuclear ribosomal DNA (nrDNA). Their popularity can be explained by high copy numbers and universal PCR primers that make their sequences easily amplified and converted into parallel datasets. Unfortunately, their utility is limited by linked loci and limited characters resulting in low confidence in the accuracy of phylogenetic estimates, especially when rapid radiations occur. In another contribution on anchored phylogenomics in angiosperms, we presented flowering plant-specific anchored enrichment probes for hundreds of conserved nuclear genes and demonstrated their use at the level of all angiosperms. In this contribution, we focus on a common problem in phylogenetic reconstructions below the family level: weak or unresolved backbone due to rapid radiations (≤10 million years) followed by long divergence, using the Cariceae-Dulichieae-Scirpeae clade (CDS, Cyperaceae) as a test case. By comparing our nuclear matrix of 461 genes to a typical Sanger-sequence dataset consisting of a few plastid genes (matK, ndhF) and an nrDNA marker (ETS), we demonstrate that our nuclear data is fully compatible with the Sanger dataset and resolves short backbone internodes with high support in both concatenated and coalescence-based analyses. In addition, we show that nuclear gene tree incongruence is inversely proportional to phylogenetic information content, indicating that incongruence is mostly due to gene tree estimation error. This suggests that large numbers of conserved nuclear loci could produce more accurate trees than sampling rapidly evolving regions prone to saturation and long-branch attraction. 
The robust phylogenetic estimates obtained here, and high congruence with previous morphological and molecular analyses, are strong evidence for a complete tribal revision of CDS. The anchored hybrid enrichment probes used in this study should be similarly effective in other flowering plant groups.

Created: 2017-05-01