
Climbing out of the anomaly zone: Gene tree filtration from 9000 markers rescues treefrogs (Hylidae) from conflicting phylogenomic relationships

NIAID Data Ecosystem, indexed 2026-03-14
Download link:
https://www.ncbi.nlm.nih.gov/sra/SRP289397
Resource description:
An emerging challenge in interpreting phylogenomic datasets is that different analytical approaches may produce conflicting phylogenetic results. The most common and intensely debated conflict concerns the relative trustworthiness of concatenation and multi-species coalescent (MSC) species tree methods. Concatenation is problematic because it can strongly support a topology different from the true species tree when incomplete lineage sorting (ILS) results in elevated gene-tree discordance. In extreme cases of ILS, the most common gene tree can be discordant from the true species tree, a phenomenon known as the anomaly zone. Concatenation can strongly support the most common anomalous gene tree (AGT), while MSC methods account for ILS to recover the correct topology. However, MSC methods assume gene trees are error-free, and the role of erroneous gene trees (EGTs), resulting from short alignment lengths, low phylogenetic informativeness, or the type of phylogenetic marker, in producing an apparent anomaly zone has not been assessed. To address this, we formalize a new alternative paradigm termed the erroneous zone to distinguish species trees detected in the anomaly zone because of EGTs from the true biological anomaly zone driven by AGTs resulting from ILS. To test these predictions, we develop FilterZone, an R package that calculates the anomaly zone and provides tools for filtering alignments and gene trees and for assessing concordance factor support. We apply these methods to Hylidae treefrogs using an expansive dataset of 9000 markers composed of exons, introns, and UCEs. We provide strong support for the monophyly of most Hylidae subfamilies and the relationships among them; however, we find conflicting topologies between concatenation and MSC and detect the anomaly zone in all datasets.
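The anomaly zone check described above can be illustrated with a minimal sketch. It is not FilterZone's code (that package is written in R and its interface is not described here). It assumes the standard four-taxon boundary function a(x) from Degnan and Rosenberg (2006), where x is the length of the parent internal branch and y the length of its descendant internal branch, both in coalescent units. The function names `anomaly_zone_limit` and `in_anomaly_zone` are hypothetical.

```python
import math


def anomaly_zone_limit(x: float) -> float:
    """Return a(x), the four-taxon anomaly zone boundary.

    x is the parent internal branch length in coalescent units.
    Formula assumed from Degnan and Rosenberg (2006):
        a(x) = log(2/3 + (3*e^(2x) - 2) / (18 * (e^(3x) - e^(2x))))
    """
    if x <= 0:
        raise ValueError("branch length x must be positive")
    numerator = 3.0 * math.exp(2.0 * x) - 2.0
    denominator = 18.0 * (math.exp(3.0 * x) - math.exp(2.0 * x))
    return math.log(2.0 / 3.0 + numerator / denominator)


def in_anomaly_zone(x: float, y: float) -> bool:
    """True if a parent/descendant pair of internal branches (x, y)
    falls inside the anomaly zone, i.e. y < a(x)."""
    return y < anomaly_zone_limit(x)


if __name__ == "__main__":
    # Hypothetical branch-length pairs, for illustration only.
    for x, y in [(0.1, 0.1), (0.1, 0.5), (1.0, 0.01)]:
        print(x, y, in_anomaly_zone(x, y))
```

Under this formula a(x) decreases with x and becomes negative once x exceeds roughly 0.27 coalescent units, so a pair with a long parent branch cannot be in the anomaly zone however short the descendant branch is. Because the inputs are branch lengths estimated from gene trees, erroneous gene trees that shorten those estimates could produce the same signal, which is the conflation the abstract describes.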
We next assess whether the hylid tree is in the erroneous zone by testing our prediction that non-biological properties of alignments produce discordant EGTs rather than AGTs from ILS. After filtering alignments and gene trees, we find that removing shorter, less informative alignments reconciles the conflict between concatenation and MSC, increases gene concordance, and supports our prediction that the species tree was in the erroneous zone. Critically for other studies, these findings point to a potentially widespread conflation of the two zones: past studies that detected the anomaly zone may instead be in the erroneous zone. Our comparison among marker types suggests as much, as UCEs and introns were not able to escape the erroneous zone because they were the shortest alignments. We conclude that these are two distinct problems; however, the erroneous zone can be escaped, and the species tree corrected, by filtering EGTs prior to species tree estimation. Unexpectedly, the cost of reconciliation was low: relatively few high-quality gene trees (20) were necessary for reconciliation, suggesting that difficult relationships could be resolved with few high-quality markers. These results could significantly impact study design and marker selection where access to research funds is limited, and also suggest a lasting importance of GenBank data.
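The filtering step can be sketched in the same spirit. The example below drops alignments that are short or carry few parsimony-informative sites before gene trees are estimated. It is an illustration only: the abstract does not state which thresholds or informativeness measure were used, so `min_length`, `min_informative`, and both function names are hypothetical, and alignments are represented simply as lists of equal-length DNA strings.

```python
from collections import Counter

VALID_BASES = set("ACGT")  # gaps and ambiguity codes are ignored


def parsimony_informative_sites(seqs):
    """Count columns in which at least two different bases
    each occur in at least two sequences."""
    count = 0
    for column in zip(*seqs):
        tallies = Counter(
            base for base in (c.upper() for c in column) if base in VALID_BASES
        )
        if sum(1 for n in tallies.values() if n >= 2) >= 2:
            count += 1
    return count


def filter_alignments(alignments, min_length, min_informative):
    """Keep alignments meeting both hypothetical thresholds.

    alignments: dict mapping marker name -> list of aligned sequences.
    """
    kept = {}
    for name, seqs in alignments.items():
        if not seqs:
            continue  # skip empty alignments
        length = len(seqs[0])
        if (length >= min_length
                and parsimony_informative_sites(seqs) >= min_informative):
            kept[name] = seqs
    return kept


if __name__ == "__main__":
    # Toy data, not drawn from the treefrog dataset.
    toy = {
        "marker_long": ["AACCGGTT", "AACCGGTA", "ATCCGGTA", "ATCCGGTT"],
        "marker_short": ["AC", "AC", "AT", "AT"],
    }
    print(sorted(filter_alignments(toy, min_length=5, min_informative=2)))
```

Gene trees would then be re-estimated from the retained markers only, and concatenation and MSC analyses repeated, to see whether the conflict and the anomaly zone signal persist.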
Created:
2023-03-25