MealTime-MS: A Machine Learning-Guided Real-Time Mass Spectrometry Analysis for Protein Identification and Efficient Dynamic Exclusion
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://figshare.com/articles/dataset/MealTime-MS_A_Machine_Learning-Guided_Real-Time_Mass_Spectrometry_Analysis_for_Protein_Identification_and_Efficient_Dynamic_Exclusion/12499892
Description:
Mass spectrometry-based proteomics technologies are prime methods
for the high-throughput identification of proteins in complex biological
samples. Nevertheless, there are still technical limitations that
hinder the ability of mass spectrometry to identify low abundance
proteins in complex samples. Characterizing such proteins is essential
to provide a comprehensive understanding of the biological processes
taking place in cells and tissues. Still today, most mass spectrometry-based
proteomics approaches use a data-dependent acquisition strategy, which
favors the collection of mass spectra from proteins of higher abundance.
Since the computational identification of proteins from proteomics
data is typically performed after mass spectrometry analysis, large
numbers of mass spectra are typically redundantly acquired from the
same abundant proteins, and few to no mass spectra are acquired
for proteins of lower abundance. We therefore propose a novel supervised
learning algorithm, MealTime-MS, that identifies proteins in real time
as mass spectrometry data are acquired and prevents further data collection
from confidently identified proteins to ultimately free mass spectrometry
resources to improve the identification sensitivity of low abundance
proteins. We use real-time simulations of a previously performed mass
spectrometry analysis of a HEK293 cell lysate to show that our approach
can identify 92.1% of the proteins detected in the experiment using
66.2% of the MS2 spectra. We also demonstrate that our approach outperforms
a previously proposed method, is sufficiently fast for real-time mass
spectrometry analysis, and is flexible. Finally, MealTime-MS’
efficient usage of mass spectrometry resources will provide a more
comprehensive characterization of proteomes in complex samples.
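
The exclusion idea described above can be illustrated with a minimal sketch. It is not the MealTime-MS implementation: the function name, the inputs, and the "at least `min_peptides` distinct peptides" rule are all hypothetical stand-ins for the supervised classifier the authors describe, and each spectrum is assumed to be already matched to a peptide. The sketch replays spectra in acquisition order, skips any spectrum belonging to a protein already counted as identified, and reports how many spectra were actually needed.

```python
from collections import defaultdict

def simulate_exclusion(spectra, peptide_to_protein, min_peptides=2):
    """Replay MS2 spectra in acquisition order with protein-level exclusion.

    spectra: list of peptide sequences, one per MS2 spectrum (hypothetical
             input; assumes each spectrum is already matched to a peptide).
    peptide_to_protein: dict mapping each peptide to a protein accession.
    min_peptides: stand-in confidence rule, NOT the paper's classifier.
    """
    seen = defaultdict(set)   # protein -> distinct peptides observed so far
    identified = set()        # proteins treated as confidently identified
    used = 0                  # spectra that would actually be acquired
    for peptide in spectra:
        protein = peptide_to_protein[peptide]
        if protein in identified:
            continue          # excluded: this spectrum would not be collected
        used += 1
        seen[protein].add(peptide)
        if len(seen[protein]) >= min_peptides:
            identified.add(protein)
    return identified, used

# Toy run: an abundant protein (P_A) dominates the acquisition order.
toy_spectra = ["A1", "A2", "A1", "A3", "B1", "A2", "B2"]
toy_map = {"A1": "P_A", "A2": "P_A", "A3": "P_A", "B1": "P_B", "B2": "P_B"}
proteins, n_used = simulate_exclusion(toy_spectra, toy_map)
print(sorted(proteins), f"{n_used}/{len(toy_spectra)} spectra used")
```

In this toy run, both proteins are still identified while three of the seven spectra, all from the abundant protein, are skipped. That is the kind of saving the abstract reports at scale, although the real method decides confidence with a trained model rather than a fixed peptide count.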
Created:
2020-06-08



