Data from: Phylogenetic factor analysis
Source: DataONE. Updated: 2017-08-03. Indexed: 2024-06-26.
Download link:
https://search.dataone.org/view/null
Description:
Phylogenetic comparative methods explore the relationships between quantitative traits while adjusting for shared evolutionary history. This adjustment often occurs through a Brownian diffusion process along the branches of the phylogeny that generates model residuals or the traits themselves. For high-dimensional traits, inferring all pairwise correlations within the multivariate diffusion is limiting. To circumvent this problem, we propose phylogenetic factor analysis (PFA), which assumes that a small, unknown number of independent evolutionary factors arise along the phylogeny and that these factors generate clusters of dependent traits. Set in a Bayesian framework, PFA provides measures of uncertainty on the factor number and groupings, combines both continuous and discrete traits, integrates over missing measurements and incorporates phylogenetic uncertainty with the help of molecular sequences. We develop Gibbs samplers based on dynamic programming to estimate the PFA posterior distribution; they run over three-fold faster than samplers for multivariate diffusion and are a further order of magnitude more efficient in the presence of latent traits. We further propose a novel marginal likelihood estimator for previously impractical models with discrete data and find that PFA also provides a better fit than multivariate diffusion in evolutionary questions in columbine flower development, placental reproduction transitions and triggerfish fin morphometry.
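As an illustration of the generative idea in the description above, here is a minimal forward-simulation sketch in Python. It is not the authors' implementation and does not perform Gibbs sampling or any inference. It assumes a toy three-tip tree, a linear loadings matrix and independent Gaussian noise, none of which come from this record; the names `tip_covariance`, `simulate_pfa`, `PARENT`, `LENGTH` and `TIPS` are hypothetical.

```python
import numpy as np

# Toy rooted tree ((A:1,B:1):1,C:2), an assumption made only for illustration.
# Node 0 is the root, node 1 is the ancestor of A and B, nodes 2, 3, 4 are tips A, B, C.
PARENT = [-1, 0, 1, 1, 0]           # parent index of each node (-1 marks the root)
LENGTH = [0.0, 1.0, 1.0, 1.0, 2.0]  # length of the branch leading to each node
TIPS = [2, 3, 4]


def root_path(node, parent):
    """Return the set of branches (named by their child node) from the root to `node`."""
    branches = set()
    while parent[node] != -1:
        branches.add(node)
        node = parent[node]
    return branches


def tip_covariance(parent, length, tips):
    """Brownian diffusion covariance between tips: total length of shared branches."""
    paths = [root_path(t, parent) for t in tips]
    n = len(tips)
    cov = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            cov[i, j] = sum(length[b] for b in paths[i] & paths[j])
    return cov


def simulate_pfa(cov, loadings, noise_sd, rng):
    """Draw independent Brownian factors at the tips, then map them to traits.

    cov      : (n_tips, n_tips) phylogenetic covariance matrix
    loadings : (n_factors, n_traits) matrix, assumed linear for this sketch
    noise_sd : standard deviation of independent Gaussian noise (an assumption)
    """
    n_tips = cov.shape[0]
    n_factors, n_traits = loadings.shape
    chol = np.linalg.cholesky(cov)
    # Each column is one factor with covariance `cov` across tips; columns are independent.
    factors = chol @ rng.standard_normal((n_tips, n_factors))
    traits = factors @ loadings + noise_sd * rng.standard_normal((n_tips, n_traits))
    return factors, traits
```

With two factors and a block-structured loadings matrix such as `np.array([[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]])`, traits 1 and 2 depend on the first factor and traits 3 and 4 on the second, which gives two clusters of dependent traits from only two latent diffusions rather than a full 4-by-4 trait correlation matrix.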
Created:
2017-08-03



