
Protein Inference Using Peptide Quantification Patterns

Source: NIAID Data Ecosystem (indexed 2026-03-08)
Download link:
https://figshare.com/articles/dataset/Protein_Inference_Using_Peptide_Quantification_Patterns/2278048
Description:
Determining which proteins are present in a sample from the list of identified peptides is a crucial step in the untargeted proteomics LC–MS/MS data-processing pipeline. This step, commonly referred to as protein inference, is challenging because many peptide sequences occur in multiple proteins. Current protein inference engines typically use peptide-to-spectrum match (PSM) quality measures and spectral count information to score protein identifications in LC–MS/MS data sets. This information is, however, not enough to confidently validate or rule out many of the proteins. Here we introduce the basis for a new way of performing protein inference that relies on accurate quantification patterns of identified peptides, using the correlation of these patterns to validate peptide-to-protein matches. For the first implementation of this approach, we focused on (1) distinguishing between unambiguously and ambiguously identified proteins and (2) generating hypotheses for the discrimination of subsets of the ambiguously identified proteins. Our preprocessing pipelines support both labeled LC–MS/MS and label-free LC–MS followed by LC–MS/MS to provide the peptide quantification. We apply our procedure to two published data sets and show that it can detect and infer proteins that would otherwise not be confidently inferred.
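The idea of using correlated quantification patterns can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, and the rule that a protein is "unambiguous" when at least one peptide maps only to it are all assumptions made for illustration. Each shared peptide is compared, by Pearson correlation, with the mean pattern of the unique peptides of each candidate protein.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def split_by_ambiguity(peptide_to_proteins):
    """Return (unambiguous, ambiguous) protein sets.

    Assumption: a protein is unambiguous if at least one peptide maps only to it.
    """
    all_proteins, unambiguous = set(), set()
    for proteins in peptide_to_proteins.values():
        all_proteins.update(proteins)
        if len(proteins) == 1:
            unambiguous.update(proteins)
    return unambiguous, all_proteins - unambiguous


def score_shared_peptides(quant, peptide_to_proteins):
    """Correlate each shared peptide with the mean unique-peptide pattern of each candidate."""
    unique_patterns = {}
    for peptide, proteins in peptide_to_proteins.items():
        if len(proteins) == 1:
            (protein,) = proteins
            unique_patterns.setdefault(protein, []).append(quant[peptide])

    scores = {}
    for peptide, proteins in peptide_to_proteins.items():
        if len(proteins) < 2:
            continue  # only shared peptides need disambiguation
        for protein in proteins:
            patterns = unique_patterns.get(protein)
            if not patterns:
                continue  # no unique peptide to compare against
            mean_pattern = [sum(col) / len(col) for col in zip(*patterns)]
            scores[(peptide, protein)] = pearson(quant[peptide], mean_pattern)
    return scores


# Hypothetical toy data: peptide abundances across four samples.
quant = {
    "PEPA": [1.0, 2.0, 3.0, 4.0],
    "PEPB": [1.1, 2.1, 2.9, 4.2],
    "PEPC": [4.0, 3.0, 2.0, 1.0],
    "PEPS": [0.9, 2.2, 3.1, 3.9],
}
peptide_to_proteins = {
    "PEPA": {"P1"},
    "PEPB": {"P1"},
    "PEPC": {"P2"},
    "PEPS": {"P1", "P2", "P3"},
}

unambiguous, ambiguous = split_by_ambiguity(peptide_to_proteins)
scores = score_shared_peptides(quant, peptide_to_proteins)
```

In this toy case the shared peptide `PEPS` follows the pattern of the unique peptides of `P1` and runs opposite to that of `P2`, so the correlation supports assigning it to `P1`. `P3` has no unique peptide and therefore receives no score, which leaves it ambiguous.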
Created: 2016-02-17