Data from: Using phylogenomics to resolve mega-families: an example from Compositae
Source: DataONE. Updated: 2015-07-22. Indexed: 2024-06-27.
Download link:
https://search.dataone.org/view/null
Description:
Next-generation sequencing and phylogenomics hold great promise for elucidating complex relationships among large plant families. Here we performed targeted capture of low-copy sequences followed by next-generation sequencing on the Illumina platform in the large and diverse angiosperm family Compositae (Asteraceae). The family is monophyletic based on morphology and molecular data, yet many areas of the phylogeny have unresolved polytomies, and interpreting phylogenetic patterns has been historically difficult. In order to outline a method and provide a framework for future phylogenetic studies in the Compositae, we sequenced 23 taxa from across the family in which the relationships were well established, as well as a member of the sister family Calyceraceae. We generated nuclear data from 795 loci and assembled chloroplast genomes from off-target capture reads, enabling the comparison of nuclear and chloroplast genomes for phylogenetic analyses. We also analyzed multi-copy nuclear genes in our data set using a clustering method during orthology detection, and we applied a network approach to these clusters, analyzing all related locus copies. Using these data we produced hypotheses of phylogenetic relationships employing both a conservative approach (restricted to loci with one copy per targeted locus) and a multigene approach (including all copies per targeted locus). The methods and bioinformatics workflow presented here provide a solid foundation for future work aimed at understanding gene family evolution in the Compositae, as well as a model for phylogenomic analyses in other plant mega-families.
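The distinction between the conservative and multigene locus sets described above can be illustrated with a minimal Python sketch. It is not the authors' workflow: the function name `partition_loci`, the `(locus, taxon, copy_id)` record layout, and the toy records are all hypothetical, and "one copy per targeted locus" is read here as "every sampled taxon has exactly one copy at that locus", which is an assumption.

```python
from collections import defaultdict

def partition_loci(records):
    """Split loci into a conservative set and a multigene set.

    records: iterable of (locus, taxon, copy_id) tuples (hypothetical layout).
    Conservative: loci where every taxon has exactly one copy (assumed reading).
    Multigene: all loci, keeping every copy recovered for each taxon.
    """
    per_locus = defaultdict(lambda: defaultdict(list))
    for locus, taxon, copy_id in records:
        per_locus[locus][taxon].append(copy_id)

    conservative, multigene = {}, {}
    for locus, by_taxon in per_locus.items():
        # The multigene set keeps every copy of every locus.
        multigene[locus] = {taxon: list(copies) for taxon, copies in by_taxon.items()}
        # The conservative set keeps a locus only if no taxon has extra copies.
        if all(len(copies) == 1 for copies in by_taxon.values()):
            conservative[locus] = {taxon: copies[0] for taxon, copies in by_taxon.items()}
    return conservative, multigene

# Toy input: locus "L2" has two copies in taxon "A", so it is multigene only.
records = [
    ("L1", "A", "a1"), ("L1", "B", "b1"),
    ("L2", "A", "a1"), ("L2", "A", "a2"), ("L2", "B", "b1"),
]
conservative, multigene = partition_loci(records)
```

With these toy records, only `L1` would enter the conservative set, while both loci, including both copies of `L2` from taxon `A`, remain in the multigene set.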
Created:
2015-07-22



