
UnFATE: A comprehensive probe set and bioinformatics pipeline for phylogeny reconstruction and multilocus barcoding of filamentous ascomycetes (Ascomycota, Pezizomycotina)

Source: DataCite Commons; updated 2025-05-01; indexed 2025-04-10
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.tht76hf1x
Description:
The subphylum Pezizomycotina (filamentous ascomycetes) is the largest clade within Ascomycota. Despite the importance of this group of fungi, our understanding of their evolution is still limited due to insufficient taxon sampling. Although next-generation sequencing technology allows us to obtain complete genomes for phylogenetic analyses, generating complete genomes of fungal species can be challenging, especially when fungi occur in symbiotic relationships or when the DNA of rare herbarium specimens is degraded or contaminated. Additionally, assembly, annotation, and gene extraction of whole-genome sequencing data require bioinformatics skills and computational power, resulting in a substantial data burden. To overcome these obstacles, we designed a universal target enrichment probe set to reconstruct the phylogenetic relationships of filamentous ascomycetes at different phylogenetic levels. From a pool of single-copy orthologous genes extracted from available Pezizomycotina genomes, we identified the smallest subset of genetic markers that can reliably reconstruct a robust phylogeny. We used a clustering approach to identify a sequence set that could provide an optimal trade-off between potential missing data and probe set cost. We incorporated this probe set into a user-friendly wrapper script named UnFATE (https://github.com/claudioametrano/UnFATE) that allows phylogenomic inferences without requiring expert bioinformatics knowledge. In addition to phylogenetic results, the software provides a powerful multilocus alternative to ITS-based barcoding. Phylogeny and barcoding approaches can be complemented by an integrated, pre-processed, and periodically updated database of all publicly available Pezizomycotina genomes. The UnFATE pipeline, using the 195 selected marker genes, consistently performed well across various phylogenetic depths, generating trees consistent with the reference phylogenomic inferences. 
The topological distance between the reference trees from the literature and the best tree produced by UnFATE ranged from 0.10 to 0.14 in normalized Robinson-Foulds distance (nRF) for phylogenies from family to subphylum level. We also tested the in vitro success of the universal bait set in a target capture approach on 25 herbarium specimens from ten representative classes in Pezizomycotina, which recovered a topology mostly congruent with recent phylogenomic inferences for this group of fungi. The discriminating power of our gene set was also assessed with the multilocus barcoding approach, which outperformed ITS-based barcoding. With these tools, we aim to provide a framework for a collaborative approach to building robust, conclusive phylogenies of this important fungal clade.
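To make the nRF figures above easier to interpret, the following is a minimal sketch of how a normalized Robinson-Foulds distance can be computed for two trees on the same taxa. It is not taken from the UnFATE code base. The nested-tuple tree encoding, the function names, and the choice to normalize by 2(n - 3) (the maximum for unrooted, fully resolved trees with n leaves) are all assumptions made for illustration; the description does not state which normalization the authors used.

```python
from itertools import chain


def leaves(tree):
    """Return the set of leaf names in a nested-tuple tree (hypothetical encoding)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset(chain.from_iterable(leaves(child) for child in tree))


def bipartitions(tree):
    """Collect the non-trivial splits of the tree, treating it as unrooted."""
    all_leaves = leaves(tree)
    anchor = min(all_leaves)  # fixed reference leaf used to orient every split
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset(chain.from_iterable(visit(child) for child in node))
        # Keep only splits with at least two leaves on each side.
        if 1 < len(below) < len(all_leaves) - 1:
            # Store the side that does not contain the anchor leaf, so the
            # same split is always represented by the same set.
            splits.add(all_leaves - below if anchor in below else below)
        return below

    visit(tree)
    return splits


def normalized_rf(tree_a, tree_b):
    """Robinson-Foulds distance divided by 2 * (n - 3) (assumed normalization)."""
    taxa = leaves(tree_a)
    if taxa != leaves(tree_b):
        raise ValueError("both trees must contain the same taxa")
    if len(taxa) < 4:
        raise ValueError("at least four taxa are needed for a non-trivial split")
    differing = bipartitions(tree_a) ^ bipartitions(tree_b)
    return len(differing) / (2 * (len(taxa) - 3))


# Toy six-taxon trees; the labels are placeholders, not real taxa.
reference = ((("A", "B"), "C"), ("D", ("E", "F")))
candidate = ((("A", "C"), "B"), ("D", ("E", "F")))
print(normalized_rf(reference, candidate))
```

Under this normalization, 0 means the two trees share every internal split and 1 means they share none, so values of 0.10 to 0.14 correspond to roughly 10 to 14 percent of the maximum possible disagreement in splits.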

Provider:
Dryad
Created:
2025-01-23