
Table_1_Helix Matrix Transformation Combined With Convolutional Neural Network Algorithm for Matrix-Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry-Based Bacterial Identification.docx

Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://figshare.com/articles/dataset/Table_1_Helix_Matrix_Transformation_Combined_With_Convolutional_Neural_Network_Algorithm_for_Matrix-Assisted_Laser_Desorption_Ionization-Time_of_Flight_Mass_Spectrometry-Based_Bacterial_Identification_docx/13226486
Description:
Matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) analysis is a rapid and reliable method for bacterial identification. Classification algorithms, a critical part of the MALDI-TOF MS workflow, have been developed using both traditional and machine learning approaches. This study presents a bacterial identification method that combines helix matrix transformation with a convolutional neural network (CNN). A total of 14 bacterial species comprising 58 strains were selected to create an in-house MALDI-TOF MS spectrum dataset. The 1D array-type MALDI-TOF MS spectrum data were converted by helix matrix transformation into matrix-type data, which were used to fit the CNN during training. Through parameter optimization, the binarization threshold was set to 16 and the final matrix size was set to 25 × 25 to obtain a clean, compact dataset. A CNN model with three convolutional layers was trained on this dataset to predict bacterial species. The filter sizes for the three convolutional layers were 4, 8, and 16, the kernel size was 3, and the activation function was the rectified linear unit (ReLU). A back propagation neural network (BPNN) model, without helix matrix transformation or convolutional layers, was built as a baseline to test whether helix matrix transformation combined with the CNN performs better. The areas under the receiver operating characteristic (ROC) curve of the CNN and BPNN models were 0.98 and 0.87, respectively. The accuracies of the CNN and BPNN models were 97.78 ± 0.08 and 86.50 ± 0.01, respectively, a statistically significant difference (p < 0.001). The results suggest that helix matrix transformation combined with the CNN algorithm enables feature extraction from bacterial MALDI-TOF MS spectra and may be a viable approach to identifying bacterial species.
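
The description names the preprocessing steps (binarization at a threshold of 16, then a helix matrix transformation to a 25 × 25 matrix) but does not define them precisely. The sketch below shows one plausible reading in plain Python and should not be taken as the authors' implementation. Several details are assumptions: that values strictly above the threshold map to 1, that the helix is a clockwise inward spiral starting at the top-left corner, and that short input is zero-padded while extra values are dropped. The function names `binarize` and `helix_matrix` are hypothetical.

```python
from typing import List, Sequence


def binarize(intensities: Sequence[float], threshold: float = 16) -> List[int]:
    """Map each intensity to 1 if it is above the threshold, else 0.

    Assumption: the comparison is strict (value > threshold).
    """
    return [1 if value > threshold else 0 for value in intensities]


def helix_matrix(values: Sequence[int], size: int = 25) -> List[List[int]]:
    """Lay a 1D sequence into a size x size grid along a spiral path.

    Assumptions: the spiral runs clockwise and inward from the top-left
    corner; short input is zero-padded and extra values are dropped.
    """
    total = size * size
    padded = list(values[:total])
    padded += [0] * (total - len(padded))

    grid = [[0] * size for _ in range(size)]
    top, bottom, left, right = 0, size - 1, 0, size - 1
    i = 0
    while top <= bottom and left <= right:
        # Fill the top edge, left to right.
        for col in range(left, right + 1):
            grid[top][col] = padded[i]
            i += 1
        top += 1
        # Fill the right edge, top to bottom.
        for row in range(top, bottom + 1):
            grid[row][right] = padded[i]
            i += 1
        right -= 1
        # Fill the bottom edge, right to left, if a row remains.
        if top <= bottom:
            for col in range(right, left - 1, -1):
                grid[bottom][col] = padded[i]
                i += 1
            bottom -= 1
        # Fill the left edge, bottom to top, if a column remains.
        if left <= right:
            for row in range(bottom, top - 1, -1):
                grid[row][left] = padded[i]
                i += 1
            left += 1
    return grid


# Small illustration on a 3 x 3 grid with made-up values 0..8.
print(helix_matrix(list(range(9)), size=3))
# [[0, 1, 2], [7, 8, 3], [6, 5, 4]]
```

With the default `size=25`, `helix_matrix(binarize(spectrum))` yields a 25 × 25 binary grid of the kind a CNN could take as a single-channel image. How the raw spectrum is reduced to 625 values beforehand is not stated in the description, so that step is left out here.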

Created:
2020-11-12