
Data from: Accounting for uncertainty in gene tree estimation: summary-coalescent species tree inference in a challenging radiation of Australian lizards

Source: DataONE. Updated: 2016-09-28. Indexed: 2024-06-26.
Download link:
https://search.dataone.org/view/null
Description:
Accurate gene tree inference is an important aspect of species tree estimation in a summary-coalescent framework. Yet, in empirical studies, inferred gene trees differ in accuracy due to stochastic variation in phylogenetic signal between targeted loci. Empiricists should therefore examine the consistency of species tree inference, while accounting for the observed heterogeneity in gene tree resolution of phylogenomic datasets. Here, we assess the impact of gene tree estimation error on summary-coalescent species tree inference by screening ~2000 exonic loci based on gene tree resolution prior to phylogenetic inference. We focus on a phylogenetically challenging radiation of Australian lizards (genus Cryptoblepharus, Scincidae) and explore effects on topology and support. We identify a well-supported topology based on all loci and find that a relatively small number of high-resolution gene trees can be sufficient to converge on the same topology. Adding gene trees with decreasing resolution produced a generally consistent topology, and increased confidence for specific bipartitions that were poorly supported when using a small number of informative loci. This corroborates coalescent-based simulation studies that have highlighted the need for a large number of loci to confidently resolve challenging relationships and refutes the notion that low-resolution gene trees introduce phylogenetic noise. Further, our study also highlights the value of quantifying changes in nodal support across locus subsets of increasing size (but decreasing gene tree resolution). Such detailed analyses can reveal anomalous fluctuations in support at some nodes, suggesting the possibility of model violation. By characterizing the heterogeneity in phylogenetic signal among loci, we can account for uncertainty in gene tree estimation and assess its effect on the consistency of the species tree estimate. 
We suggest that the evaluation of gene tree resolution should be incorporated in the analysis of empirical phylogenomic datasets. This will ultimately increase our confidence in species tree estimation using summary-coalescent methods and enable us to exploit genomic data for phylogenetic inference.
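
The screening step described above, ranking loci by gene tree resolution and then analysing subsets of increasing size but decreasing resolution, can be sketched roughly as follows. This is only an illustrative sketch, not the authors' pipeline: the function name `nested_locus_subsets`, the locus names, and the resolution scores are all hypothetical, and the abstract does not say how resolution was measured, so the scores are simply taken as given.

```python
from typing import Dict, List


def nested_locus_subsets(resolution: Dict[str, float], sizes: List[int]) -> List[List[str]]:
    """Return nested locus subsets, best-resolved loci first.

    `resolution` maps a locus name to a precomputed gene tree resolution
    score (hypothetical input; higher means better resolved).
    `sizes` lists the subset sizes to build.
    """
    # Rank loci from highest to lowest resolution; ties are broken by
    # name so the ordering is reproducible.
    ranked = sorted(resolution, key=lambda locus: (-resolution[locus], locus))

    subsets = []
    for size in sorted(sizes):
        if size < 1 or size > len(ranked):
            raise ValueError(f"subset size {size} is outside 1..{len(ranked)}")
        # Each larger subset contains every smaller one and adds
        # progressively less-resolved loci.
        subsets.append(ranked[:size])
    return subsets


# Hypothetical scores for four loci (not taken from the dataset).
scores = {"locus_a": 0.90, "locus_b": 0.35, "locus_c": 0.70, "locus_d": 0.10}
subsets = nested_locus_subsets(scores, [2, 4])
for subset in subsets:
    print(len(subset), subset)
```

Under these assumptions, the gene trees in each subset would then go to a summary-coalescent method, and nodal support would be compared across subsets to spot the kind of fluctuations the abstract mentions.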

Created:
2016-09-28