
MealTime-MS: A Machine Learning-Guided Real-Time Mass Spectrometry Analysis for Protein Identification and Efficient Dynamic Exclusion

Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://figshare.com/articles/dataset/MealTime-MS_A_Machine_Learning-Guided_Real-Time_Mass_Spectrometry_Analysis_for_Protein_Identification_and_Efficient_Dynamic_Exclusion/12499892
Description:
Mass spectrometry-based proteomics technologies are prime methods for the high-throughput identification of proteins in complex biological samples. Nevertheless, technical limitations still hinder the ability of mass spectrometry to identify low-abundance proteins in complex samples. Characterizing such proteins is essential to provide a comprehensive understanding of the biological processes taking place in cells and tissues. Even today, most mass spectrometry-based proteomics approaches use a data-dependent acquisition strategy, which favors the collection of mass spectra from proteins of higher abundance. Since the computational identification of proteins from proteomics data is typically performed after mass spectrometry analysis, large numbers of mass spectra are redundantly acquired from the same abundant proteins, and few or no mass spectra are acquired for proteins of lower abundance. We therefore propose a novel supervised learning algorithm, MealTime-MS, that identifies proteins in real time as mass spectrometry data are acquired and prevents further data collection from confidently identified proteins, ultimately freeing mass spectrometry resources to improve the identification sensitivity of low-abundance proteins. We use real-time simulations of a previously performed mass spectrometry analysis of a HEK293 cell lysate to show that our approach can identify 92.1% of the proteins detected in the experiment using 66.2% of the MS2 spectra. We also demonstrate that our approach outperforms a previously proposed method, is sufficiently fast for real-time mass spectrometry analysis, and is flexible. Finally, MealTime-MS's efficient usage of mass spectrometry resources will provide a more comprehensive characterization of proteomes in complex samples.
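The exclusion idea described in the abstract can be illustrated with a toy replay of an acquisition run. This is only a minimal sketch under stated assumptions, not the MealTime-MS implementation: the `Spectrum` record, the `simulate_exclusion` function, the per-spectrum `score`, and the summed-score threshold are all hypothetical stand-ins, and the threshold rule in particular replaces the supervised classifier that the abstract mentions but does not specify.

```python
from dataclasses import dataclass


@dataclass
class Spectrum:
    """One MS2 scan in a replayed run (hypothetical record layout)."""
    peptide: str   # peptide the scan matches, known here because we replay old data
    protein: str   # parent protein of that peptide
    score: float   # illustrative match score; not a real search-engine score


def simulate_exclusion(spectra, confidence_threshold=1.5):
    """Replay spectra in acquisition order and skip scans from proteins
    that are already considered confidently identified.

    Assumption: a protein counts as "confident" once its summed scores reach
    confidence_threshold. This simple rule stands in for a trained classifier.
    """
    evidence = {}        # protein -> accumulated score so far
    identified = set()   # proteins placed on the exclusion list
    used = 0             # number of MS2 scans actually "acquired"

    for s in spectra:
        if s.protein in identified:
            continue     # excluded: the instrument time would be spent elsewhere
        used += 1
        evidence[s.protein] = evidence.get(s.protein, 0.0) + s.score
        if evidence[s.protein] >= confidence_threshold:
            identified.add(s.protein)

    return identified, used


# Made-up example data: an abundant protein P1 and a scarcer protein P2.
demo = [
    Spectrum("PEPA", "P1", 0.9),
    Spectrum("PEPB", "P1", 0.8),
    Spectrum("PEPA", "P1", 0.9),
    Spectrum("PEPC", "P1", 0.7),
    Spectrum("PEPX", "P2", 0.6),
    Spectrum("PEPY", "P2", 0.95),
]

identified, used = simulate_exclusion(demo)
print(sorted(identified), f"{used} of {len(demo)} spectra used")
```

In this made-up run, the two later scans from P1 are skipped once P1 crosses the threshold, which mirrors the abstract's point that redundant spectra from abundant proteins can be avoided. The numbers carry no relation to the 92.1% and 66.2% figures reported for the HEK293 data.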

Created:
2020-06-08