Data from: From algae to angiosperms–inferring the phylogeny of green plants (Viridiplantae) from 360 plastid genomes
Source: DataONE | Updated: 2014-02-18 | Indexed: 2024-06-27
Download link:
https://search.dataone.org/view/null
Description:
Background: Next-generation sequencing has provided a wealth of plastid genome sequence data from an increasingly diverse set of green plants (Viridiplantae). Although these data have been useful for reconstructing the phylogeny of numerous clades of photosynthetic organisms (e.g., green algae, angiosperms, and gymnosperms), their utility for inferring relationships across all green plants is uncertain. Viridiplantae originated 700-1500 million years ago and may comprise as many as 500,000 species. This clade represents a major source of photosynthetic carbon and contains an immense diversity of life forms, including some of the smallest and largest eukaryotes. Here we explore the limits and challenges of inferring a comprehensive green plant phylogeny from available complete or nearly complete plastid genome data.
Results: We assembled protein-coding sequence data for 78 genes from 360 diverse green plant taxa with complete or nearly complete plastid genome sequences available from GenBank. Phylogenetic analyses of the plastid data recovered well-supported backbone relationships and strong support for relationships that were not observed in previous analyses of major subclades within Viridiplantae. However, there is also evidence of systematic error in some analyses. In several instances we obtained strongly supported but conflicting topologies from analyses of nucleotide versus amino acid characters, and the considerable variation in GC content among lineages and within single genomes affected the phylogenetic placement of several taxa.
Conclusions: Analyses of the plastid data recovered a strongly supported framework of relationships for green plants. This includes the placement of Zygnematophyceae as sister to land plants (Embryophyta) and a clade of extant gymnosperms (Acrogymnospermae) with cycads + Ginkgo sister to remaining members and with gnetophytes (Gnetophyta) sister to non-Pinaceae conifers (Gnecup trees); within the monilophyte clade (Monilophyta), relationships are strongly supported with Equisetales + Psilotales sister to Marattiales + leptosporangiate ferns. We also highlight the challenges of using plastid genome sequences in deep-level phylogenomic analyses and provide suggestions for future analyses that will likely incorporate plastid genome data for thousands of species. We particularly emphasize the importance of exploring the effects of different partitioning and character coding protocols for the entire data set as well as subsets of the data.
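The description notes that GC content varies among lineages and within single genomes, and that this variation affected the placement of several taxa. As an illustration only, the sketch below shows one way such variation might be summarized per taxon and per codon position for an in-frame coding sequence. The function names and the toy sequences are hypothetical and are not taken from the dataset or from the authors' pipeline.

```python
from typing import Dict


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T).

    Gaps and ambiguity codes are ignored; returns 0.0 if no such bases exist.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return gc / total if total else 0.0


def gc_by_codon_position(cds: str) -> Dict[int, float]:
    """GC content at codon positions 1, 2 and 3.

    Assumes the sequence (or aligned sequence) starts in frame.
    """
    return {pos + 1: gc_content(cds[pos::3]) for pos in range(3)}


if __name__ == "__main__":
    # Hypothetical toy sequences, not real plastid genes.
    toy = {"taxon_A": "ATGGCCGCGTAA", "taxon_B": "ATGAATTTATAA"}
    for name, cds in toy.items():
        print(name, round(gc_content(cds), 3), gc_by_codon_position(cds))
```

Comparing these values across taxa is a common first check before deciding how to partition or recode the data, for example by treating third codon positions separately.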
Created:
2014-02-18



