
Constraint-tree methods resolve problematic branches in a recently diverged genus of New World swallow

NIAID Data Ecosystem, indexed 2026-03-10
Download link:
https://www.ncbi.nlm.nih.gov/bioproject/PRJNA503768
Resource description:
The tree swallows (Tachycineta) are a model group of birds, and knowledge of their phylogeny is essential to interpreting a myriad of ecological, physiological, and behavioral data. However, reconstructing the phylogeny of Tachycineta has proven extremely difficult, probably because the branching order of its nine species is complicated by incomplete lineage sorting and introgression. Given earlier failures to resolve the phylogeny using Sanger-sequence data, we applied three methods of phylogenetic inference to a dataset of thousands of ultraconserved element (UCE) loci sequenced for multiple individuals of each species. Unfortunately, we recovered three well-supported discordant topologies. This result is perhaps unsurprising. The increasing incidence of alternative phylogenomic datasets yielding conflicting answers to phylogenetic questions has generated substantial debate concerning the collection and analysis of such data, and this study falls into a particularly troubling area, in which analysis of the same data by different methods leads to conflicting trees. We resolved the conflict by applying two methods, gene genealogy interrogation and analysis of per-site likelihood differences, which compare species trees and gene trees constrained to represent a set of topological hypotheses. These methods have previously been used only to resolve higher-level phylogenetic problems; we demonstrate here that they also have the potential to clarify relationships in much more recently diverged groups. The conflicts we observed in the Tachycineta tree were driven by a tiny proportion of sites in the dataset; excluding less than five percent of loci from the concatenated alignment, or one percent of sites from each locus, was enough to change the results of coalescent-based phylogenetic inferences. The sensitivity of phylogenomic reconstruction to small numbers of influential sites has been observed in other studies and emphasizes both the importance of broad genomic sampling and the need to investigate biological sources of discord. This study makes clear that in many cases phylogenetic inference from genome-scale data will be inconclusive and will require post-inference analyses to select the most plausible tree.
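The idea that a small number of influential sites can decide between competing topologies can be illustrated with a toy calculation. The sketch below is not the authors' pipeline: the function names, the two-tree setup ("T1" and "T2"), and the rule of dropping sites with the largest absolute difference are assumptions made for illustration, and the log-likelihood values are invented. It assumes per-site log-likelihoods for each constrained tree have already been produced by some likelihood program.

```python
def site_lnl_differences(lnl_t1, lnl_t2):
    """Per-site log-likelihood differences, tree T1 minus tree T2."""
    if len(lnl_t1) != len(lnl_t2):
        raise ValueError("both trees must be scored on the same sites")
    return [a - b for a, b in zip(lnl_t1, lnl_t2)]


def favored_tree(deltas):
    """Return which tree the summed differences support."""
    total = sum(deltas)
    if total > 0:
        return "T1"
    if total < 0:
        return "T2"
    return "tie"


def drop_most_influential(deltas, n_drop):
    """Remove the n_drop sites with the largest absolute difference.

    This ranking rule is a hypothetical stand-in for a site-filtering step.
    """
    ranked = sorted(range(len(deltas)), key=lambda i: abs(deltas[i]), reverse=True)
    dropped = set(ranked[:n_drop])
    return [d for i, d in enumerate(deltas) if i not in dropped]


# Invented data: 99 sites weakly favor T2, one site strongly favors T1.
lnl_t1 = [-1.00] * 99 + [-1.0]
lnl_t2 = [-0.99] * 99 + [-3.0]

deltas = site_lnl_differences(lnl_t1, lnl_t2)
print(favored_tree(deltas))                            # all 100 sites
print(favored_tree(drop_most_influential(deltas, 1)))  # one site (1%) removed
```

In this toy dataset the preferred tree flips once the single outlier site is removed, which mirrors in miniature the sensitivity the abstract describes.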

Created:
2018-11-04