Enhancing Open Modification Searches via a Combined Approach Facilitated by Ursgal

NIAID Data Ecosystem, indexed 2026-03-12

Download link:

https://zenodo.org/record/4299357

Resource description:

The identification of peptide sequences and their post-translational modifications (PTMs) is a crucial step in the analysis of bottom-up proteomics data. The recent development of open modification search (OMS) engines allows virtually all PTMs to be searched for. This not only increases the number of spectra that can be matched to peptides but also greatly advances the understanding of the biological roles of PTMs through the identification, and thereby facilitated quantification, of peptidoforms (peptide sequences and their potential PTMs). While the benefits of combining results from multiple protein database search engines have been established previously, similar approaches for OMS results have been missing so far. Here, we compare and combine results from three different OMS engines, demonstrating an increase in peptide spectrum matches of 8-18%. The unification of search results furthermore allows for the combined downstream processing of search results, including the mapping to potential PTMs. Finally, we test the ability of OMS engines to identify glycosylated peptides. The implementation of these engines in the Python framework Ursgal facilitates the straightforward application of OMS with unified parameters and results files, thereby enabling as yet unmatched high-throughput, large-scale data analysis. This dataset includes all relevant results files, databases, and scripts that correspond to the accompanying journal article.
Specifically, the following files are deposited:

Homo_sapiens_PXD004452_results.zip: result files from OMS and CS for the dataset PXD004452
Homo_sapiens_PXD013715_results.zip: result files from OMS and CS for the dataset PXD013715
Haloferax_volcanii_PXD021874_results.zip: result files from OMS and CS for the dataset PXD021874
Escherichia_coli_PXD000498_results.zip: result files from OMS and CS for the dataset PXD000498
databases.zip: target-decoy databases for Homo sapiens, Escherichia coli and Haloferax volcanii as well as a glycan database for Homo sapiens
scripts.zip: example scripts for all relevant steps of the analysis
mzml_files.zip: mzML files for all included datasets
ursgal.zip: current version of Ursgal (0.6.7) that has been used to generate the results (for most recent versions see https://github.com/ursgal/ursgal)
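
The abstract reports that combining results from three OMS engines increases peptide spectrum matches (PSMs) by 8-18%. As a rough illustration of how such a gain could be computed, the sketch below takes the union of PSM sets from several engines and reports the percent increase over each single engine. It is not the method used in the article or an Ursgal API: the function name `combined_psm_gain`, the engine names, the choice to key a PSM by (spectrum ID, peptidoform), and the toy identifications are all assumptions made for this example, and steps such as FDR control are left out.

```python
from typing import Dict, Set, Tuple

# Assumption: a PSM is identified by (spectrum ID, peptidoform string).
PSM = Tuple[str, str]


def combined_psm_gain(results: Dict[str, Set[PSM]]) -> Dict[str, float]:
    """Return the percent increase of the combined PSM set over each engine."""
    # The combined result is the union of all engines' PSM sets.
    combined: Set[PSM] = set().union(*results.values())
    gains: Dict[str, float] = {}
    for engine, psms in results.items():
        if not psms:
            raise ValueError(f"engine {engine!r} reported no PSMs")
        gains[engine] = 100.0 * (len(combined) - len(psms)) / len(psms)
    return gains


# Toy, made-up identifications for three hypothetical engines.
toy_results = {
    "engine_a": {("scan1", "PEPTIDE"), ("scan2", "PEPT[+80]IDE"), ("scan3", "AAAK")},
    "engine_b": {("scan1", "PEPTIDE"), ("scan3", "AAAK"), ("scan4", "LLLR")},
    "engine_c": {("scan2", "PEPT[+80]IDE"), ("scan4", "LLLR"), ("scan5", "GGGK")},
}

gains = combined_psm_gain(toy_results)
for engine, gain in sorted(gains.items()):
    print(f"{engine}: +{gain:.1f}% PSMs when combined")
```

With real data, the sets would be filled from the unified result files contained in the deposited archives, after whatever filtering the analysis applies.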

Created:

2020-12-02
