Accurate, scalable, and fully automated inference of species trees from raw genome assemblies using ROADIES
Source: DataCite Commons. Updated: 2026-03-04. Indexed: 2026-04-25.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.tht76hf73
Description:
Current genome sequencing initiatives across a wide range of life forms
offer significant potential to enhance our understanding of evolutionary
relationships and support transformative biological and medical
applications. Species trees play a central role in many of these
applications; however, despite the widespread availability of genome
assemblies, accurate inference of species trees remains challenging for
many scientists due to the limited automation, significant domain
expertise, and substantial computational resources required by
conventional methods. To address this limitation, we present ROADIES, a
fully automated pipeline to infer species trees starting from raw genome
assemblies (those lacking prior annotations). In contrast to prevailing
approaches, ROADIES randomly selects segments of the input genomes to
generate gene trees. This eliminates the need to choose any single
reference species or perform the cumbersome steps of gene annotation and
whole-genome alignment. ROADIES also leverages existing discordance-aware
methods that allow multi-copy genes, eliminating the need to infer
orthology. Using the genomic datasets from large-scale sequencing efforts
across four diverse life forms (placental mammals, pomace flies, birds,
and budding yeasts), we show that ROADIES infers species trees that are
comparable in quality with the state-of-the-art studies that involved
domain experts but in a fraction of the time and effort. With its speed,
accuracy, and automation, ROADIES has the potential to vastly simplify
species tree inference, making it accessible to a broader range of
scientists and applications.
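
The random segment selection described above can be illustrated with a minimal sketch. This is not the ROADIES implementation: the function name `sample_segments`, its parameters, the in-memory dictionary of sequences, and the uniform choice of genome and start position are all assumptions made for illustration only.

```python
import random


def sample_segments(genomes, num_segments, segment_length, seed=0):
    """Draw fixed-length segments at random from a set of genomes.

    `genomes` is a hypothetical mapping of species name -> sequence string.
    Returns a list of (species, start, segment) tuples.
    """
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    # Only genomes at least as long as the segment can be sampled from.
    eligible = sorted(
        name for name, seq in genomes.items() if len(seq) >= segment_length
    )
    if not eligible:
        raise ValueError("no genome is long enough for the requested segment length")

    segments = []
    for _ in range(num_segments):
        name = rng.choice(eligible)  # assumption: genome chosen uniformly
        seq = genomes[name]
        # Pick a start position so the segment fits entirely in the sequence.
        start = rng.randrange(len(seq) - segment_length + 1)
        segments.append((name, start, seq[start:start + segment_length]))
    return segments
```

Because no genome is given a special role in the sampling, a sketch like this needs neither a reference species nor annotations. Turning the sampled segments into gene trees would require the further steps the abstract mentions, which are not shown here.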
Provider:
Dryad
Created:
2024-07-20



