Data from: A new hierarchy of phylogenetic models consistent with heterogeneous substitution rates

Source: DataONE. Updated 2015-04-10. Indexed 2024-06-27.

Download link:

https://search.dataone.org/view/null

Description:

When the process underlying DNA substitutions varies across evolutionary history, some standard Markov models underlying phylogenetic methods are mathematically inconsistent. The most prominent example is the general time reversible model (GTR) together with some, but not all, of its submodels. To rectify this deficiency, Lie Markov models have been developed as the class of models that are consistent in the face of a changing process of DNA substitutions. Some well-known models in popular use are within this class, but are either overly simplistic (e.g. the Kimura two-parameter model) or overly complex (the general Markov model). On a diverse set of biological data sets, we test a hierarchy of Lie Markov models spanning the full range of parameter richness. Compared against the benchmark of the ever-popular GTR model, we find that as a whole the Lie Markov models perform well, with the best performing models having eight parameters and the ability to recognise the distinction between purines and pyrimidines.
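The consistency property described above can be illustrated with a small sketch. It is not taken from the dataset or the authors' code. It assumes that "consistent" means closure under matrix multiplication: if the substitution process changes partway along a branch, the overall transition matrix is the product of two matrices from the model, and that product should again belong to the model. The example uses the Kimura two-parameter pattern with states ordered A, G, C, T; the function names and the probability values are hypothetical choices made for illustration.

```python
# Illustrative sketch only: checks that the Kimura two-parameter (K2P)
# pattern survives multiplication of two different K2P matrices.
# State order is assumed to be A, G, C, T, so A<->G and C<->T are
# transitions and every other change is a transversion.

def k2p_matrix(transition, transversion):
    """Build a 4x4 Markov matrix with the K2P pattern (hypothetical helper)."""
    stay = 1.0 - transition - 2.0 * transversion
    a, b, c = stay, transition, transversion
    return [
        [a, b, c, c],
        [b, a, c, c],
        [c, c, a, b],
        [c, c, b, a],
    ]

def matmul(x, y):
    """Plain 4x4 matrix product."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]

def has_k2p_pattern(m, tol=1e-12):
    """True if m has one value for 'stay', one for transitions, one for transversions."""
    a, b, c = m[0][0], m[0][1], m[0][2]
    expected = [
        [a, b, c, c],
        [b, a, c, c],
        [c, c, a, b],
        [c, c, b, a],
    ]
    return all(
        abs(m[i][j] - expected[i][j]) <= tol
        for i in range(4)
        for j in range(4)
    )

# Two different K2P processes acting one after the other on the same branch.
early = k2p_matrix(transition=0.10, transversion=0.02)
late = k2p_matrix(transition=0.03, transversion=0.05)
combined = matmul(early, late)

print(has_k2p_pattern(combined))
```

Under the same assumption, the abstract's point about GTR is that the product of two GTR transition matrices with different parameters need not correspond to any single GTR model; this sketch does not demonstrate that case.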

Created:

2015-04-10