
MealTime-MS: A Machine Learning-Guided Real-Time Mass Spectrometry Analysis for Protein Identification and Efficient Dynamic Exclusion

Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://figshare.com/articles/dataset/MealTime-MS_A_Machine_Learning-Guided_Real-Time_Mass_Spectrometry_Analysis_for_Protein_Identification_and_Efficient_Dynamic_Exclusion/12499877
Description:
Mass spectrometry-based proteomics technologies are prime methods for the high-throughput identification of proteins in complex biological samples. Nevertheless, technical limitations still hinder the ability of mass spectrometry to identify low-abundance proteins in complex samples. Characterizing such proteins is essential for a comprehensive understanding of the biological processes taking place in cells and tissues. Most mass spectrometry-based proteomics approaches still use a data-dependent acquisition strategy, which favors the collection of mass spectra from proteins of higher abundance. Because the computational identification of proteins from proteomics data is typically performed after mass spectrometry analysis, large numbers of mass spectra are redundantly acquired from the same abundant proteins, and few or no mass spectra are acquired for proteins of lower abundance. We therefore propose a novel supervised learning algorithm, MealTime-MS, that identifies proteins in real time as mass spectrometry data are acquired and prevents further data collection from confidently identified proteins, freeing mass spectrometry resources to improve the identification sensitivity of low-abundance proteins. We use real-time simulations of a previously performed mass spectrometry analysis of a HEK293 cell lysate to show that our approach can identify 92.1% of the proteins detected in the experiment using 66.2% of the MS2 spectra. We also demonstrate that our approach outperforms a previously proposed method, is sufficiently fast for real-time mass spectrometry analysis, and is flexible. Finally, MealTime-MS's efficient use of mass spectrometry resources will provide a more comprehensive characterization of proteomes in complex samples.
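
The protein-level exclusion idea described above can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the MealTime-MS implementation: the function name `simulate_exclusion`, the per-spectrum scores, the peptide-to-protein mapping, and the fixed `score_threshold` are all hypothetical, and the simple summed-score threshold stands in for the trained supervised classifier mentioned in the description.

```python
from collections import defaultdict


def simulate_exclusion(spectra, peptide_to_protein, score_threshold=2.0):
    """Replay spectra in acquisition order with protein-level exclusion.

    spectra: list of (peptide, score) tuples (hypothetical input format).
    peptide_to_protein: dict mapping each peptide to its parent protein.
    Returns (set of identified proteins, number of spectra actually used).
    """
    evidence = defaultdict(float)  # accumulated score per protein
    identified = set()
    excluded_peptides = set()
    used = 0

    for peptide, score in spectra:
        if peptide in excluded_peptides:
            # In a real run this precursor would not be fragmented at all.
            continue
        used += 1
        protein = peptide_to_protein[peptide]
        evidence[protein] += score
        # Assumption: a fixed threshold replaces the trained classifier.
        if evidence[protein] >= score_threshold:
            identified.add(protein)
            # Exclude every peptide of the confidently identified protein.
            excluded_peptides.update(
                p for p, prot in peptide_to_protein.items() if prot == protein
            )
    return identified, used


if __name__ == "__main__":
    mapping = {"A1": "P_A", "A2": "P_A", "A3": "P_A", "B1": "P_B"}
    run = [("A1", 1.5), ("A2", 1.0), ("A1", 1.2),
           ("A3", 0.9), ("B1", 2.5), ("B1", 2.0)]
    proteins, used = simulate_exclusion(run, mapping)
    print(sorted(proteins), f"{used}/{len(run)} spectra used")
```

In this made-up run, both proteins are identified while half of the spectra are skipped, which mirrors in miniature the trade-off the description reports (most proteins recovered from a fraction of the MS2 spectra).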

Created:
2020-06-08