ProteomeGenerator: A Framework for Comprehensive Proteomics Based on de Novo Transcriptome Assembly and High-Accuracy Peptide Mass Spectral Matching
Source: NIAID Data Ecosystem (indexed 2026-03-10)
Download link:
https://figshare.com/articles/dataset/ProteomeGenerator_A_Framework_for_Comprehensive_Proteomics_Based_on_de_Novo_Transcriptome_Assembly_and_High-Accuracy_Peptide_Mass_Spectral_Matching/7227839
Description:
Modern mass spectrometry now permits
genome-scale and quantitative
measurements of biological proteomes. However, analysis of specific
specimens is currently hindered by the incomplete representation of
biological variability of protein sequences in canonical reference
proteomes and the technical demands for their construction. Here,
we report ProteomeGenerator, a framework for de novo and reference-assisted
proteogenomic database construction and analysis based on sample-specific
transcriptome sequencing and high-accuracy mass spectrometry proteomics.
This enables the assembly of proteomes encoded by actively transcribed
genes, including sample-specific protein isoforms resulting from non-canonical
mRNA transcription, splicing, or editing. To improve the accuracy
of protein isoform identification in non-canonical proteomes, ProteomeGenerator
relies on statistical target–decoy database matching calibrated
using sample-specific controls. Its current implementation includes
automatic integration with MaxQuant mass spectrometry proteomics algorithms.
We applied this method for the proteogenomic analysis of splicing
factor SRSF2 mutant leukemia cells, demonstrating
high-confidence identification of non-canonical protein isoforms arising
from alternative transcriptional start sites, intron retention, and
cryptic exon splicing as well as improved accuracy of genome-scale
proteome discovery. Additionally, we report proteogenomic performance
metrics for current state-of-the-art implementations of SEQUEST HT,
MaxQuant, Byonic, and PEAKS mass spectral analysis algorithms. Finally,
ProteomeGenerator is implemented as a Snakemake workflow within a
Singularity container for one-step installation in diverse computing
environments, thereby enabling open, scalable, and facile discovery
of sample-specific, non-canonical, and neomorphic biological proteomes.
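
The target-decoy matching mentioned above can be illustrated with a minimal sketch. It is not the ProteomeGenerator or MaxQuant implementation, and it does not reproduce the sample-specific calibration described in the abstract. It assumes each peptide-spectrum match (PSM) is reduced to a (score, is_decoy) pair where a higher score is better; the names target_decoy_fdr and score_cutoff are hypothetical. The false discovery rate at a score threshold is estimated as the number of decoy matches divided by the number of target matches at or above that threshold.

```python
from typing import List, Optional, Tuple

# A PSM is modeled as (score, is_decoy). This layout is an assumption made
# for illustration; real search engines report many more fields.
PSM = Tuple[float, bool]


def target_decoy_fdr(psms: List[PSM], threshold: float) -> float:
    """Estimate FDR as decoys / targets among PSMs scoring >= threshold."""
    targets = sum(1 for score, is_decoy in psms
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms
                 if score >= threshold and is_decoy)
    if targets == 0:
        # No target matches: report 1.0 if only decoys passed, else 0.0.
        return 1.0 if decoys else 0.0
    return min(1.0, decoys / targets)


def score_cutoff(psms: List[PSM], max_fdr: float = 0.01) -> Optional[float]:
    """Return the lowest observed score whose estimated FDR is <= max_fdr.

    Returns None when no observed score satisfies the limit.
    """
    best = None
    # Walk observed scores from highest to lowest, keeping the last one
    # that still satisfies the FDR limit.
    for threshold in sorted({score for score, _ in psms}, reverse=True):
        if target_decoy_fdr(psms, threshold) <= max_fdr:
            best = threshold
    return best


if __name__ == "__main__":
    example = [(10.0, False), (9.0, False), (8.0, False),
               (7.0, True), (6.0, False)]
    print(score_cutoff(example, max_fdr=0.01))
```

In this toy input the single decoy at score 7.0 raises the estimated FDR for every threshold at or below it, so a strict 1% limit keeps only the three highest-scoring target matches.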
Created:
2018-10-19



