Data from: Probabilistic methods outperform parsimony in the phylogenetic analysis of data simulated without a probabilistic model

Source: DataONE. Updated 2018-08-20. Indexed 2024-06-08.

Download link:

https://search.dataone.org/view/null


Resource description:

To understand the patterns and processes of the diversification of life, we require an accurate understanding of the interrelationships among taxa. Recent studies have suggested that analyses of morphological character data using the Bayesian and Maximum Likelihood Mk model provide more accurate phylogenies than parsimony methods. These studies have proved controversial, particularly because they simulate morphological data under Markov models that assume shared branch lengths across characters; it is claimed that this biases results in favour of the Bayesian or Maximum Likelihood Mk model over parsimony models, which do not explicitly make this assumption. We avoid these potential issues by employing a simulation protocol in which character states are randomly assigned to tips, but datasets are constrained to an empirically realistic distribution of homoplasy as measured by the Consistency Index. Datasets were analysed with equal-weights and implied-weights parsimony, and with the Maximum Likelihood and Bayesian Mk model. We find that method choice is largely irrelevant for high-consistency (low-homoplasy) datasets, with which all methods perform well; the largest discrepancies in accuracy occur with low-consistency (high-homoplasy) datasets. In such cases, the Bayesian Mk model is significantly more accurate than alternative models, and implied-weights parsimony never significantly outperforms the Bayesian Mk model. When poorly supported branches are collapsed, the Bayesian Mk model recovers trees with higher resolution than other methods. Since it is not possible to assess homoplasy independently of a tree estimate, the Bayesian Mk model emerges as the most reliable method for categorical morphological analyses.
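For readers unfamiliar with the Consistency Index mentioned in the description, the following Python sketch may help. It is not the authors' code: the nested-tuple tree encoding, the function names, and the choice of reference tree are all assumptions made for illustration. It counts the steps each character needs on a fixed tree using Fitch parsimony, then computes the ensemble Consistency Index as the minimum conceivable number of steps divided by the observed number of steps.

```python
import random


def fitch_steps(tree, states):
    """Minimum number of state changes (Fitch parsimony) for one unordered
    character on a rooted binary tree encoded as nested 2-tuples of tip names."""
    steps = 0

    def visit(node):
        nonlocal steps
        if isinstance(node, str):           # tip: its observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            return common
        steps += 1                           # disjoint state sets imply one change
        return left | right

    visit(tree)
    return steps


def consistency_index(tree, characters):
    """Ensemble CI: summed minimum possible steps / summed observed steps."""
    min_steps = sum(len(set(c.values())) - 1 for c in characters)
    obs_steps = sum(fitch_steps(tree, c) for c in characters)
    # Assumption: a matrix that needs no changes at all is treated as CI = 1.
    return min_steps / obs_steps if obs_steps else 1.0


def random_character(tips, rng, n_states=2):
    """Assign a state to every tip uniformly at random (hypothetical helper)."""
    return {tip: rng.randrange(n_states) for tip in tips}


if __name__ == "__main__":
    tree = (("A", "B"), ("C", "D"))          # hypothetical four-tip tree
    clean = {"A": 0, "B": 0, "C": 1, "D": 1}  # fits the tree with one change
    noisy = {"A": 0, "B": 1, "C": 0, "D": 1}  # needs two changes (homoplasy)
    print(fitch_steps(tree, clean), fitch_steps(tree, noisy))
    print(consistency_index(tree, [clean, noisy]))

    rng = random.Random(0)
    matrix = [random_character("ABCD", rng) for _ in range(10)]
    print(consistency_index(tree, matrix))
```

One way to constrain random datasets to a target level of homoplasy would be to keep only the matrices whose Consistency Index falls within a chosen range. That rejection step is an assumption here; the description does not say how the constraint was applied.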


Created:

2018-08-20
