TaxIt: An Iterative Computational Pipeline for Untargeted Strain-Level Identification Using MS/MS Spectra from Pathogenic Single-Organism Samples
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://figshare.com/articles/dataset/TaxIt_An_Iterative_Computational_Pipeline_for_Untargeted_Strain-Level_Identification_Using_MS_MS_Spectra_from_Pathogenic_Single-Organism_Samples/12312146
Resource description:
Untargeted accurate strain-level
classification of a priori unidentified
organisms using tandem mass spectrometry is a challenging task. Reference
databases often lack taxonomic depth, limiting peptide assignments
to the species level. However, extending these databases with detailed strain
information increases runtime and decreases statistical power. In
addition, larger databases contain a higher number of similar proteomes.
We present TaxIt, an iterative workflow to address the increasing
search space required for MS/MS-based strain-level classification
of samples with unknown taxonomic origin. TaxIt first applies reference
sequence data for initial identification of species candidates, followed
by automated acquisition of relevant strain sequences for low-level
classification. Furthermore, proteome similarities resulting in ambiguous
taxonomic assignments are addressed with an abundance weighting strategy
to increase the confidence in candidate taxa. For benchmarking the
performance of our method, we apply our iterative workflow to several
samples of bacterial and viral origin. In comparison to noniterative
approaches using unique peptides or advanced abundance correction,
TaxIt identifies microbial strains correctly in all examples presented
(with one tie), thereby demonstrating the potential for untargeted
and deeper taxonomic classification. TaxIt makes extensive use of
public, unrestricted, and continuously growing sequence resources
such as the NCBI databases and is available under an open-source BSD
license at https://gitlab.com/rki_bioinformatics/TaxIt.
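
The two-stage idea described above (species candidates first, then strain-level scoring restricted to those candidates, with shared peptides weighted by abundance) can be illustrated with a minimal Python sketch. This is not TaxIt's implementation: the function names, the toy peptide strings, the taxon labels, and the specific weighting rule (shared peptides split in proportion to each taxon's unique-peptide count) are all assumptions made for illustration, and a real pipeline would re-search the MS/MS spectra against each database rather than look up peptide strings.

```python
def weighted_counts(peptides, peptide_to_taxa):
    """Toy abundance weighting (an assumed rule, not TaxIt's):
    unique peptides count fully; shared peptides are split in
    proportion to each taxon's unique-peptide count."""
    unique = {}
    for pep in peptides:
        taxa = peptide_to_taxa.get(pep, set())
        if len(taxa) == 1:
            taxon = next(iter(taxa))
            unique[taxon] = unique.get(taxon, 0.0) + 1.0

    scores = dict(unique)
    for pep in peptides:
        taxa = peptide_to_taxa.get(pep, set())
        if len(taxa) > 1:
            total = sum(unique.get(t, 0.0) for t in taxa)
            for t in taxa:
                # Fall back to an even split when no taxon has unique evidence.
                share = unique.get(t, 0.0) / total if total > 0 else 1.0 / len(taxa)
                scores[t] = scores.get(t, 0.0) + share
    return scores


def iterative_classify(peptides, species_index, fetch_strain_index, top_n=1):
    """Stage 1: rank species. Stage 2: score only strains of the top species."""
    species_scores = weighted_counts(peptides, species_index)
    candidates = sorted(species_scores, key=species_scores.get, reverse=True)[:top_n]
    strain_index = fetch_strain_index(candidates)  # stands in for automated download
    return candidates, weighted_counts(peptides, strain_index)


# Hypothetical toy data: peptide -> taxa that contain it.
species_index = {
    "PEPA": {"species_X"},
    "PEPB": {"species_X", "species_Y"},
    "PEPC": {"species_X"},
    "PEPD": {"species_Y"},
}
strain_db = {
    "species_X": {
        "PEPA": {"X_strain1", "X_strain2"},
        "PEPB": {"X_strain1", "X_strain2"},
        "PEPE": {"X_strain1"},
        "PEPF": {"X_strain1"},
        "PEPG": {"X_strain2"},
    },
}


def fetch_strain_index(candidates):
    """Merge the strain-level indexes of the candidate species."""
    merged = {}
    for species in candidates:
        for pep, strains in strain_db.get(species, {}).items():
            merged.setdefault(pep, set()).update(strains)
    return merged


observed = ["PEPA", "PEPB", "PEPC", "PEPE", "PEPF"]
candidates, strain_scores = iterative_classify(observed, species_index, fetch_strain_index)
best_strain = max(strain_scores, key=strain_scores.get)
print(candidates, best_strain, strain_scores)
```

In this toy run the strain-specific peptides are invisible to the species-level index and only contribute in the second stage, which mirrors the motivation for keeping the first search space small and expanding it only for the candidate species.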
Created:
2020-05-04



