
Data from: The Impact of the Tree Prior on Molecular Dating of Data Sets Containing a Mixture of Inter- and Intraspecies Data

Source: DataONE. Updated: 2016-10-18. Indexed: 2024-06-26.
Download link:
https://search.dataone.org/view/null
Description:
In Bayesian phylogenetic analyses of genetic data, prior probability distributions need to be specified for the model parameters, including the tree. When Bayesian methods are used for molecular dating, available tree priors include those designed for species-level data, such as the pure-birth and birth-death priors, and coalescent-based priors designed for population-level data. However, molecular dating methods are frequently applied to data sets that include multiple individuals across multiple species. Such data sets violate the assumptions of both the speciation and coalescent-based tree priors, making it unclear which should be chosen and whether this choice can affect the estimation of node times. To investigate this problem, we used a simulation approach to produce data sets with different proportions of within- and between-species sampling under the multispecies coalescent model. These data sets were then analysed under pure-birth, birth-death, constant-size coalescent, and skyline coalescent tree priors. We also explored the ability of Bayesian model testing to select the best-performing priors. We confirmed the applicability of our results to empirical data sets from cetaceans, phocids, and coregonid whitefish. Estimates of node times were generally robust to the choice of tree prior, but some combinations of tree priors and sampling schemes led to large differences in the age estimates. In particular, the pure-birth tree prior frequently led to inaccurate estimates for data sets containing a mixture of inter- and intraspecific sampling, whereas the birth-death and skyline coalescent priors produced stable results across all scenarios. Model testing provided an adequate means of rejecting inappropriate tree priors. Our results suggest that tree priors do not strongly affect Bayesian molecular dating results in most cases, even when severely misspecified. However, the choice of tree prior can be significant for the accuracy of dating results in the case of data sets with mixed inter- and intraspecies sampling.

Created:
2016-10-18