Signal Partitioning Algorithm for Highly Efficient Gaussian Mixture Modeling in Mass Spectrometry
Source: NIAID Data Ecosystem. Indexed: 2026-03-08
Download link:
https://figshare.com/articles/dataset/_Signal_Partitioning_Algorithm_for_Highly_Efficient_Gaussian_Mixture_Modeling_in_Mass_Spectrometry_/1499355
Description:
Mixture modeling of mass spectra has many potential applications, including peak detection and quantification, smoothing, de-noising, feature extraction, and spectral signal compression. However, existing algorithms do not allow automated analysis of whole spectra. As a result, although several papers have highlighted the potential advantages of mixture modeling for mass spectra of peptide/protein mixtures and presented preliminary results, the approach has not yet reached the stage where it can be systematically compared with existing software packages for proteomic mass spectra analysis. In this paper we present an efficient algorithm for Gaussian mixture modeling of proteomic mass spectra of different types (e.g., MALDI-ToF profiling, MALDI-IMS). The main idea is to automatically partition the protein mass spectral signal into fragments. Each fragment is decomposed separately into a Gaussian mixture model, and the parameters of the fragment models are then aggregated to form the mixture model of the whole spectrum. We compare the proposed algorithm with existing peak detection algorithms and demonstrate the improvement in peak detection efficiency obtained with Gaussian mixture modeling. We also apply the algorithm to real proteomic datasets of low and high resolution.
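The partition, decompose, and aggregate idea in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the splitting rule (cut wherever intensity falls below a fixed `floor`), the fixed number of components `k` per fragment, the weighted EM fit, and all function names are assumptions made for illustration, and the spectrum is synthetic.

```python
import math

def gauss(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def split_at_valleys(mz, y, floor):
    """Hypothetical partition rule: cut wherever intensity drops below `floor`."""
    fragments, start = [], None
    for i, v in enumerate(y):
        if v >= floor and start is None:
            start = i
        elif v < floor and start is not None:
            fragments.append((mz[start:i], y[start:i]))
            start = None
    if start is not None:
        fragments.append((mz[start:], y[start:]))
    return fragments

def fit_fragment(mz, y, k, iters=50, min_sigma=1e-3):
    """Weighted EM: each m/z sample acts as an observation with weight y."""
    total = sum(y)
    lo, hi = mz[0], mz[-1]
    mus = [lo + (j + 0.5) * (hi - lo) / k for j in range(k)]
    sigmas = [(hi - lo) / (2 * k) or 1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each m/z sample
        resp = []
        for x in mz:
            p = [w * gauss(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
            z = sum(p) or 1e-300
            resp.append([pj / z for pj in p])
        # M-step: intensity-weighted updates of mean, width, and weight
        for j in range(k):
            nj = sum(r[j] * yi for r, yi in zip(resp, y))
            if nj <= 0:
                continue
            mus[j] = sum(r[j] * yi * x for r, yi, x in zip(resp, y, mz)) / nj
            var = sum(r[j] * yi * (x - mus[j]) ** 2 for r, yi, x in zip(resp, y, mz)) / nj
            sigmas[j] = max(math.sqrt(var), min_sigma)
            weights[j] = nj / total
    return list(zip(weights, mus, sigmas))

def model_spectrum(mz, y, floor, k=1):
    """Fit each fragment separately, then rescale weights by the fragment's signal share."""
    fragments = split_at_valleys(mz, y, floor)
    grand_total = sum(sum(fy) for _, fy in fragments)
    model = []
    for fmz, fy in fragments:
        share = sum(fy) / grand_total
        model.extend((w * share, m, s) for w, m, s in fit_fragment(fmz, fy, k))
    return model

# Synthetic two-peak spectrum (made-up values, not data from the paper).
mz = [980 + 0.5 * i for i in range(201)]
y = [100 * gauss(x, 1000, 2) + 50 * gauss(x, 1050, 3) for x in mz]
model = model_spectrum(mz, y, floor=0.01)
for w, m, s in model:
    print(f"weight={w:.3f}  mean={m:.2f}  sigma={s:.2f}")
```

Fitting fragments independently keeps each EM problem small, and rescaling the per-fragment weights by each fragment's share of the total signal makes the aggregated weights sum to one over the whole spectrum. A practical version would also need a rule for choosing the number of components per fragment, which this sketch leaves to the caller.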
Created:
2015-07-31



