Large-Scale Reanalysis of Publicly Available HeLa Cell Proteomics Data in the Context of the Human Proteome Project
Source: NIAID Data Ecosystem (indexed 2026-03-10)
Download link:
https://figshare.com/articles/dataset/Large-Scale_Reanalysis_of_Publicly_Available_HeLa_Cell_Proteomics_Data_in_the_Context_of_the_Human_Proteome_Project/7096091
Description:
The practice of data sharing in the
proteomics field took off and quickly spread in recent years as a
result of collective effort. Nowadays, most journal editors mandate
the submission of the original raw mass spectra to one of the databases
of the ProteomeXchange consortium. With the exception of large institutional
initiatives such as PeptideAtlas or the GPMDB, however, few new studies
are based on the reanalysis of mass spectrometry data. A wealth
of information is thus left unexploited in public databases and repositories.
Here, we present the large-scale reanalysis of 41 publicly available
data sets corresponding to experiments carried out on the HeLa cancer
cell line using a custom workflow. In addition to the search for new
post-translational modification sites and “missing proteins”,
our main goal is to identify single amino acid variants and evaluate
their impact on protein expression and stability through the spectral
counting quantification approach. The X!Tandem software was selected
to search a total of 56 363 701 tandem
mass spectra against a customized variant protein database, compiled
by applying the in-house MzVar tool to HeLa-specific somatic
and genomic variants retrieved from the COSMIC cell line project.
After filtering the resulting identifications with a 1% FDR threshold
computed at the protein level, 49 466 unique peptides were
identified in 7266 protein entries, allowing the validation of 5576
protein entries in accordance with the HPP guidelines version 2.1.
A new “missing protein” was observed (FRAT2, NX_O75474,
chromosome 10), and 189 new phosphorylation and 392 new protein N-terminal
acetylation sites could be identified. Twenty-four variant peptides
were also identified, corresponding to 21 variants in 21 proteins.
For three of the nine heterozygous cases where both the variant peptide
and its wild-type counterpart were detected, the application of a
two-tailed sign test showed a significant difference in the abundance
of the two peptide versions.
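
The two-tailed sign test mentioned above can be sketched as follows. This is a minimal illustration only, not the authors' workflow: it assumes each heterozygous case provides paired spectral counts (variant peptide, wild-type peptide) per data set, drops tied pairs, and uses the exact binomial distribution. The function name `two_tailed_sign_test` and the example counts are hypothetical.

```python
from math import comb


def two_tailed_sign_test(pairs):
    """Exact two-tailed sign test on paired counts.

    pairs: iterable of (variant_count, wild_type_count), one pair per
    data set (assumed input layout, not taken from the source).
    Returns the p-value under H0: both signs are equally likely.
    """
    pairs = list(pairs)  # allow generators; we iterate twice
    n_pos = sum(1 for v, w in pairs if v > w)
    n_neg = sum(1 for v, w in pairs if v < w)
    n = n_pos + n_neg  # tied pairs carry no sign and are dropped
    if n == 0:
        return 1.0
    k = min(n_pos, n_neg)
    # P(X <= k) for X ~ Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the smaller tail, capped at 1
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical spectral counts (variant, wild type) across data sets
    example = [(2, 9), (1, 7), (3, 8), (0, 5), (4, 4), (2, 6), (1, 3)]
    print(two_tailed_sign_test(example))
```

The sign test uses only the direction of each paired difference, so it makes no assumption about how spectral counts are distributed; the trade-off is lower power than tests that also use the magnitude of the differences.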
Created:
2018-09-17



