Data from: Terrace aware data structure for phylogenomic inference from supermatrices
Source: DataONE | Updated: 2016-04-21 | Indexed: 2024-06-26
Download link:
https://search.dataone.org/view/null
Summary:
In phylogenomics, the analysis of concatenated gene alignments, the so-called supermatrix, is commonly accompanied by the assumption of partition models. Under such models, each gene, or more generally each partition, is allowed to evolve under its own evolutionary model. Although partition models provide a more comprehensive analysis of supermatrices, missing data may hamper tree search algorithms due to the existence of phylogenetic (partial) terraces. Here we introduce the phylogenetic terrace aware (PTA) data structure for efficient analysis under partition models. In the presence of missing data, PTA exploits (partial) terraces and induced partition trees to save computation time. We show that an implementation of PTA in IQ-TREE leads to a substantial speedup of up to 4.5 and 8 times compared with the standard IQ-TREE and RAxML implementations, respectively. PTA is generally applicable to all types of partition models and common topological rearrangements, and thus can be employed by all phylogenomic inference software.
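The notion of an induced partition tree in the summary can be made concrete with a small sketch. It is not the PTA implementation from IQ-TREE; it only illustrates the underlying idea, under these assumptions: trees are written as nested Python tuples of taxon names, each partition is represented by the set of taxa that have data for it, and the names `induced_splits` and `same_terrace` are hypothetical helpers introduced here. Two species trees are treated as lying on the same terrace when restricting them to each partition's taxa yields identical induced trees.

```python
def induced_splits(tree, taxa):
    """Return the non-trivial bipartitions of `tree` restricted to `taxa`.

    `tree` is a nested tuple of taxon names (strings), read as unrooted.
    The resulting set of splits identifies the induced partition tree.
    """
    taxa = frozenset(taxa)
    splits = set()

    def visit(node):
        # Collect the taxa below this node that are present in the partition.
        if isinstance(node, str):
            clade = frozenset([node]) & taxa
        else:
            clade = frozenset().union(*(visit(child) for child in node))
        rest = taxa - clade
        # Keep only splits with at least two taxa on each side.
        if len(clade) >= 2 and len(rest) >= 2:
            splits.add(frozenset([clade, rest]))
        return clade

    visit(tree)
    return splits


def same_terrace(tree_a, tree_b, partitions):
    """True if both trees induce the same tree for every partition."""
    return all(
        induced_splits(tree_a, taxa) == induced_splits(tree_b, taxa)
        for taxa in partitions
    )


# Hypothetical example: D is missing from partition 2, E from partition 1.
partitions = [{"A", "B", "C", "D"}, {"A", "B", "C", "E"}]
tree1 = (("A", "B"), ("C", ("D", "E")))
tree2 = (("A", "B"), ("D", ("C", "E")))  # differs from tree1 in where D and E attach
tree3 = (("A", "C"), ("B", ("D", "E")))  # changes the A/B/C/D relationships

print(same_terrace(tree1, tree2, partitions))  # True
print(same_terrace(tree1, tree3, partitions))  # False
```

Here `tree1` and `tree2` are different species trees, but because D and E never occur in the same partition, both induce the same tree on each partition's taxa. A rearrangement that leaves a partition's induced tree unchanged in this way is the kind of case where, as the summary puts it, computation time can be saved.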
Created:
2016-04-21



