Evaluating the Possibility of Detecting Variants in Shotgun Proteomics via LeTE-Fusion Analysis Pipeline
Source: NIAID Data Ecosystem (indexed 2026-03-10)
Download link:
https://figshare.com/articles/dataset/Evaluating_the_Possibility_of_Detecting_Variants_in_Shotgun_Proteomics_via_LeTE-Fusion_Analysis_Pipeline/6995033
Resource description:
In proteogenomic studies, many genome-annotated events, for example,
single amino acid variations (SAAVs) and short INDELs, are often unobserved
in shotgun proteomics. Therefore, we propose an analysis pipeline
called LeTE-fusion (Le, peptide length; T, theoretical values; E,
experimental data) to first investigate whether peptides of certain
lengths are observed more often in mass spectrometry (MS)-based proteomics;
such a length bias may hinder peptide identification and make
genome-annotated events difficult to detect. By applying LeTE-fusion to
different MS-based proteome data sets, we found that peptides of 7–20 amino
acids are more frequently identified, which is possibly attributable to
MS-related factors rather than proteases. We then extended the usage of
LeTE-fusion to four variant-containing-sequence data sets (SAAV-only)
of varying sample complexity, up to the scale of the whole human proteome,
which indicated that theoretically ∼70% of variants are observable in an
ideal shotgun proteomics experiment. However, only ∼40% of variants might be
detectable in real shotgun proteomic experiments when LeTE-fusion
uses the experimentally observed wild-type peptides containing the
variant sites in PeptideAtlas to estimate the expected observable coverage
of variants. Finally, we conducted a case study on the HEK293 cell line,
in which variants reported at the genomic level were also identified
in shotgun proteomics, to demonstrate the efficacy of LeTE-fusion in
estimating the expected observable coverage of variants. To the best of
our knowledge, this is the first study to systematically investigate
the detection limits of genome-annotated events in shotgun proteomics
using such an analysis pipeline.
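
The theoretical side of the estimate described above can be illustrated with a minimal sketch. It is not the LeTE-fusion implementation. It assumes trypsin as the protease (cleavage after K or R unless the next residue is P), no missed cleavages, and a fixed 7–20 residue window taken from the length range reported in the description. The function names and the toy sequence are hypothetical.

```python
import re

def tryptic_digest(protein):
    """In silico digest. Assumed rule: cleave after K or R unless followed
    by P, with no missed cleavages. Splitting on a zero-width pattern
    requires Python 3.7 or later."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]

def peptide_covering(protein, site):
    """Return the digested peptide containing the 0-based residue `site`."""
    start = 0
    for pep in tryptic_digest(protein):
        end = start + len(pep)
        if start <= site < end:
            return pep
        start = end
    raise IndexError("site lies outside the protein")

def theoretically_observable(protein, site, alt, min_len=7, max_len=20):
    """Apply the substitution first, because a variant that creates or
    removes K/R changes the cleavage pattern, then test whether the
    variant-containing peptide falls in the assumed length window."""
    mutated = protein[:site] + alt + protein[site + 1:]
    return min_len <= len(peptide_covering(mutated, site)) <= max_len

def observable_fraction(protein, variants):
    """Fraction of (site, alt) substitutions that pass the length filter."""
    hits = sum(theoretically_observable(protein, s, a) for s, a in variants)
    return hits / len(variants)

# Toy protein and hypothetical SAAVs given as (0-based site, new residue).
toy_protein = "AAKCDEFGHILMNRSTK"
toy_variants = [(5, "Q"), (0, "G"), (7, "K")]
fraction = observable_fraction(toy_protein, toy_variants)
```

In this toy case only the substitution at site 5 yields a variant peptide inside the window; the one at site 7 introduces a new cleavage site and shortens its peptide to five residues. An experimental estimate like the one described would additionally require evidence that the corresponding wild-type peptide has actually been observed, which this sketch does not model.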
Created:
2018-08-22



