
Examining Troughs in the Mass Distribution of All Theoretically Possible Tryptic Peptides

Source: NIAID Data Ecosystem (indexed 2026-03-07)
Download link:
https://figshare.com/articles/dataset/Examining_Troughs_in_the_Mass_Distribution_of_All_Theoretically_Possible_Tryptic_Peptides/2618941
Resource description:
This work describes the mass distribution of all theoretically possible tryptic peptides composed of the 20 amino acids, up to a mass of 3 kDa, at a resolution of 0.001 Da. We characterize the regions between the peaks of the distribution, including gaps (forbidden zones) and sparsely populated areas (quiet zones). We show how the gaps shrink with increasing mass and at what mass they disappear completely. We demonstrate that peptide compositions in quiet zones are less diverse than those in the peaks of the distribution, and that eliminating certain types of unrealistic compositions can widen the gaps in the distribution. The mass distribution is generated using a parallel implementation of a recursive procedure that enumerates all amino acid compositions. It enumerates all compositions of tryptic peptides below 3 kDa in 48 min on a computer cluster with 12 Intel Xeon X5650 CPUs (72 cores). The results of this work can be used to facilitate protein identification and mass defect labeling in mass spectrometry-based proteomics experiments.
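
The recursive enumeration described above can be illustrated with a minimal, single-threaded sketch. It is not the authors' parallel implementation, and it rests on several assumptions that the description does not state: residue masses are the commonly tabulated monoisotopic values rounded to 5 decimals, a tryptic peptide is simplified to a composition with exactly one K or R (the C-terminal residue) and no other K or R (no missed cleavages), and the names `RESIDUE_MASS` and `tryptic_mass_histogram` are hypothetical. Masses are held as integers in units of 0.00001 Da so that binning at 0.001 Da is exact.

```python
from collections import Counter

# Assumed monoisotopic residue masses in Da (standard tabulated values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the residue sum, in Da
SCALE = 100_000    # integer units of 0.00001 Da


def tryptic_mass_histogram(max_mass, bin_width=0.001):
    """Count compositions per mass bin for peptides lighter than max_mass (Da).

    Simplifying assumption: each composition holds exactly one K or R
    and any multiset of the other 18 residues.
    """
    limit = round(max_mass * SCALE)
    units_per_bin = round(bin_width * SCALE)
    # Non-terminal residues as integer masses, sorted ascending for pruning.
    inner = sorted(
        round(m * SCALE) for aa, m in RESIDUE_MASS.items() if aa not in "KR"
    )
    hist = Counter()

    def extend(start, mass):
        # Each call corresponds to exactly one composition.
        hist[mass // units_per_bin] += 1
        # Only add residues at index >= start, so each multiset is built once.
        for i in range(start, len(inner)):
            new_mass = mass + inner[i]
            if new_mass >= limit:
                break  # all remaining residues are at least as heavy
            extend(i, new_mass)

    for terminal in "KR":
        base = round((WATER + RESIDUE_MASS[terminal]) * SCALE)
        if base < limit:
            extend(0, base)
    return hist


# Small demonstration limit; the 3 kDa range in the description is far larger.
hist = tryptic_mass_histogram(500.0)
print("compositions:", sum(hist.values()))
print("occupied 0.001 Da bins:", len(hist))
```

Bins absent from the returned counter are empty at the chosen resolution, so runs of consecutive missing bin indices between occupied ones correspond to gaps in this simplified model. The recursion is independent for each starting residue, which is one natural place where work could be split across cores.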

Created:
2016-02-23