Next Generation Proteomic Pipeline for Chromosome-Based Proteomic Research Using NeXtProt and GENCODE Databases
Source: NIAID Data Ecosystem (indexed 2026-03-10)
Download link:
https://figshare.com/articles/dataset/Next_Generation_Proteomic_Pipeline_for_Chromosome-Based_Proteomic_Research_Using_NeXtProt_and_GENCODE_Databases/5498185
Resource description:
The Human Proteome Project
aims to map all human proteins, including
missing proteins as well as proteoforms with post-translational modifications,
alternative splicing variants (ASVs), and single amino acid variants
(SAAVs). The neXtProt and Ensembl databases are commonly used to provide
curated information on human coding genes. To find these
proteoforms, we (the Chr 11 team) first introduce a streamlined pipeline
that uses a customized, concatenated database built from neXtProt and GENCODE
(which originates from Ensembl), with a controlled false discovery rate (FDR).
Because the databases used in this pipeline are large, we found that more stringent
FDR filtering (0.1% at the peptide level and 1% at the protein level)
was needed to claim novel findings, such as GENCODE ASVs and missing proteins,
from a human hippocampus data set (MassIVE MSV000081385; ProteomeXchange
PXD007166). Using our next generation proteomic pipeline (nextPP)
with the neXtProt and GENCODE databases, two missing proteins,
activity-regulated cytoskeleton-associated protein (ARC, Chr 8) and
glutamate receptor ionotropic, kainate 5 (GRIK5, Chr 19), were additionally
identified with two or more unique peptides from human brain tissues.
In addition, applying the pipeline to human brain-related data
sets, including cortex (PXD000067 and PXD000561), spinal cord, and fetal
brain (PXD000561), identified seven GENCODE ASVs in two or more data sets:
ACTN4-012 (Chr 19), DPYSL2-005 (Chr 8), MPRIP-003 (Chr 17), NCAM1-013 (Chr 11),
EPB41L1-017 (Chr 20), AGAP1-004 (Chr 2), and CPNE5-005
(Chr 6). The identified
peptides of the GENCODE ASVs mapped onto novel exon insertions, alternative
translation at the 5′-untranslated region, or novel protein-coding
sequences. Applying the pipeline to male reproductive organ
data sets identified 52 GENCODE ASVs from two testis data sets (PXD000561
and PXD002179) and one spermatozoa data set (PXD003947). Four of the
52 GENCODE ASVs, RAB11FIP5-008 (Chr 2), RP13-347D8.7-001
(Chr X), PRDX4-002 (Chr X), and RP11-666A8.13-001
(Chr 17), were identified in all three samples.
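
The abstract reports ASVs that recur across data sets (in two or more, or in all three). A minimal Python sketch of such a recurrence filter follows. It is an illustration only, not the nextPP implementation: the function name `recurrent_ids`, the input layout, and the transcript labels `ASV_A` to `ASV_D` are hypothetical placeholders, and only the three data set accessions are taken from the abstract.

```python
from collections import Counter


def recurrent_ids(hits_by_dataset, min_datasets=2):
    """Return identifiers seen in at least `min_datasets` data sets, sorted.

    `hits_by_dataset` maps a data set accession to the identifiers that
    passed filtering in that data set (hypothetical input layout).
    """
    counts = Counter()
    for ids in hits_by_dataset.values():
        counts.update(set(ids))  # count each identifier once per data set
    return sorted(i for i, n in counts.items() if n >= min_datasets)


# Toy input: accessions come from the abstract; the ASV labels are
# placeholders and do NOT reflect the published per-data-set results.
toy_hits = {
    "PXD000561": ["ASV_A", "ASV_B", "ASV_C"],
    "PXD002179": ["ASV_A", "ASV_C", "ASV_C"],
    "PXD003947": ["ASV_A", "ASV_D"],
}

print(recurrent_ids(toy_hits))                  # two or more data sets
print(recurrent_ids(toy_hits, min_datasets=3))  # all three data sets
```

Converting each list to a set before counting keeps a repeated hit within one data set from inflating the recurrence count.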
Created:
2017-10-13



