Machine Learning of MS dataset
Source: Mendeley Data. Updated: 2024-01-31. Indexed: 2024-06-26.
Download link:
https://data.mendeley.com/datasets/2rcc8488hx
Description:
Mass spectrometric profiling provides information on the protein and metabolic composition of biological samples. However, the low efficiency of computational algorithms in matching tandem spectra to molecular components (proteins and metabolites) severely limits the use of "omics" data for the classification of nosologies. Developing machine learning methods that analyze raw HPLC-MS/MS measurements directly, without data preprocessing or identification, therefore seems promising. In this study, we tested the application of two types of neural networks, 1D-Residual CNN and 3D-CNN, to combined metabolomic and proteomic HPLC-MS/MS data for the classification of three cancer phenotypes. Both neural networks are capable of classifying gender-mixed oncological phenotypes (kidney cancer) as well as gender-specific phenotypes (ovarian cancer), and of recognizing the healthy condition, with an accuracy of 0.95 by analyzing "omics" data in the "mgf" data format. The neural networks also make it possible to determine the degree of similarity (distance matrix) between the submitted phenotypes, thus overcoming algorithmic barriers in identifying HPLC-MS/MS spectra. The closest distance was shown between ovarian cancer, kidney cancer, and prostate cancer/kidney cancer, whereas the healthy phenotype was the most distant from the cancer phenotypes. Neural networks are versatile and can be applied to standard experimental data formats of different analytical platforms.
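For readers who want to inspect the "mgf" files, the following is a minimal sketch of reading Mascot Generic Format text and turning each spectrum into a fixed-length intensity vector of the kind a 1D network could consume. It is not the preprocessing used in the study, which the description does not specify. The function names `parse_mgf` and `bin_spectrum`, the m/z range, and the bin width are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def parse_mgf(text: str) -> List[Dict[str, object]]:
    """Parse MGF text into a list of spectra (hypothetical helper).

    Each spectrum is a dict with "params" (KEY=value header lines)
    and "peaks" (list of (m/z, intensity) tuples).
    """
    spectra: List[Dict[str, object]] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line and not line[0].isdigit():
                # Header line such as TITLE=..., PEPMASS=..., CHARGE=...
                key, value = line.split("=", 1)
                current["params"][key.strip().upper()] = value.strip()
            else:
                # Peak line: m/z and intensity separated by whitespace
                parts = line.split()
                if len(parts) >= 2:
                    current["peaks"].append((float(parts[0]), float(parts[1])))
    return spectra


def bin_spectrum(
    peaks: List[Peak],
    mz_min: float = 0.0,     # assumed range, not taken from the study
    mz_max: float = 2000.0,  # assumed range, not taken from the study
    bin_width: float = 1.0,  # assumed resolution, not taken from the study
) -> List[float]:
    """Sum intensities into fixed-width m/z bins and scale to a maximum of 1."""
    n_bins = int((mz_max - mz_min) / bin_width)
    vector = [0.0] * n_bins
    for mz, intensity in peaks:
        if mz_min <= mz < mz_max:
            index = min(int((mz - mz_min) / bin_width), n_bins - 1)
            vector[index] += intensity
    top = max(vector, default=0.0)
    return [v / top for v in vector] if top > 0 else vector
```

A typical use would be to read one file with `open(path).read()`, call `parse_mgf` on the text, and apply `bin_spectrum` to the `"peaks"` entry of each spectrum before stacking the vectors into a model input.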
Created:
2024-01-31



