Data from: A generalized K statistic for estimating phylogenetic signal from shape and other high-dimensional multivariate data
Source: DataONE | Updated: 2014-04-22 | Indexed: 2024-06-27
Download link:
https://search.dataone.org/view/null
Description:
Phylogenetic signal is the tendency for closely related species to display similar trait values due to their common ancestry. Several methods have been developed for quantifying phylogenetic signal in univariate traits and for sets of traits treated simultaneously, and the statistical properties of these approaches have been extensively studied. However, methods for assessing phylogenetic signal in high-dimensional multivariate traits like shape are less well developed, and their statistical performance is not well characterized. In this article, I describe a generalization of the K statistic of Blomberg et al. (2003) that is useful for quantifying and evaluating phylogenetic signal in high-dimensional multivariate data. The method (Kmult) is derived from the equivalency between statistical methods based on covariance matrices and those based on distance matrices. Using computer simulations based on Brownian motion, I demonstrate that the expected value of Kmult remains at 1.0 as trait variation among species is increased or decreased, and as the number of trait dimensions is increased. By contrast, estimates of phylogenetic signal found with a squared-change parsimony procedure for multivariate data change with increasing trait variation among species and with increasing numbers of trait dimensions, confounding biological interpretations. I also evaluate the statistical performance of hypothesis testing procedures based on Kmult and find that the method displays appropriate Type I error and high statistical power for detecting phylogenetic signal in high-dimensional data. Statistical properties of Kmult were consistent for simulations using bifurcating and random phylogenies, for simulations using different numbers of species, for simulations that varied the number of trait dimensions, and for different underlying models of trait covariance structure. Overall, these findings demonstrate that Kmult provides a useful means of evaluating phylogenetic signal in high-dimensional multivariate traits. Finally, I illustrate the utility of the new approach by evaluating the strength of phylogenetic signal for head shape in a lineage of Plethodon salamanders.
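The description above does not state the Kmult formula, so the following is only a minimal sketch of how such a statistic could be computed. It assumes the usual form of Blomberg's K (an observed ratio of mean squared errors divided by its Brownian-motion expectation) with the univariate sums of squares replaced by sums over all trait dimensions, which is one way to express the covariance/distance equivalency. The function names `solve` and `k_mult`, and the inputs `Y` (species-by-trait matrix) and `C` (phylogenetic covariance matrix), are hypothetical and are not taken from the dataset or the author's code.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix [a | b] as floats.
    m = [list(map(float, ra)) + list(map(float, rb)) for ra, rb in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0.0:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def k_mult(Y, C):
    """Sketch of a multivariate K: Y is N x p trait data, C is N x N (assumed form)."""
    n, p = len(Y), len(Y[0])
    # Solve C X = [1 | Y] once: column 0 is C^-1 1, the remaining columns are C^-1 Y.
    X = solve(C, [[1.0] + list(row) for row in Y])
    w = [row[0] for row in X]
    s = sum(w)  # 1' C^-1 1
    # Phylogenetic mean (GLS estimate at the root) for each trait dimension.
    root = [sum(w[i] * Y[i][j] for i in range(n)) / s for j in range(p)]
    # Residuals from the root, and C^-1 applied to those residuals.
    R = [[Y[i][j] - root[j] for j in range(p)] for i in range(n)]
    CinvR = [[X[i][j + 1] - w[i] * root[j] for j in range(p)] for i in range(n)]
    # Observed ratio: plain sum of squares over phylogenetically weighted sum of squares.
    plain = sum(R[i][j] ** 2 for i in range(n) for j in range(p))
    weighted = sum(R[i][j] * CinvR[i][j] for i in range(n) for j in range(p))
    # Expected ratio under Brownian motion, which depends only on the tree.
    expected = (sum(C[i][i] for i in range(n)) - n / s) / (n - 1)
    return (plain / weighted) / expected
```

Under these assumptions the value is unchanged when all trait values are rescaled by a constant, and a star phylogeny (identity `C`) gives a value of approximately 1.0 for any data. The hypothesis test mentioned in the description would additionally require a permutation or simulation step, which is not sketched here.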
Created:
2014-04-22



