SingleFrag
Source: Figshare | Updated: 2024-10-23 | Indexed: 2026-04-28
Download link:
https://figshare.com/articles/dataset/SingleFrag/27266787
Description:
This item can be divided into 3 parts:
Neural networks: networks trained to predict the presence of a peak at a specific m/z position in a tandem mass spectrum. Each file name encodes the m/z position as mz*100 (the m/z value multiplied by 100).
ANN: Artificial Neural Networks
GNN: Graph Neural Networks
COM: networks that combine the predictive power of the ANN and GNN models
Mol2vecModel: contains a Mol2vec model trained to obtain a 300-dimensional vector from a molecule's SMILES string.
modelData:
AllMostFreqMolGeneral_rep_dades1.pickle: file containing the number of peaks in every m/z bin across the tandem mass spectra in the training set.
thresholdsANN.pickle: one threshold for each of the 1,000 most frequent m/z positions in the training set. If an ANN model's prediction for a given position is greater than or equal to the threshold for that position, a peak is predicted in that bin.
thresholdsGNN.pickle: same as above, but for the GNN models.
thresholdsCOM.pickle: same as above, but for the COM models.
Every data file is stored in .pickle format, created with Python 3.8.19.
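The thresholding rule described above (a peak is predicted when the model output is greater than or equal to the per-position threshold) can be sketched in Python. This is a minimal illustration, not the authors' code: the internal layout of the threshold files is not documented here, so the sketch assumes a plain dict mapping the integer mz*100 to a float threshold. The helper names `mz_to_key` and `predict_peaks`, the threshold values, and the scores are all hypothetical.

```python
import io
import pickle


def mz_to_key(mz: float) -> int:
    """Encode an m/z value as the integer mz*100 (e.g. 91.05 -> 9105)."""
    return int(round(mz * 100))


def predict_peaks(scores: dict, thresholds: dict) -> dict:
    """Map each position to True when its score >= its threshold."""
    return {
        key: score >= thresholds[key]
        for key, score in scores.items()
        if key in thresholds  # skip positions that have no threshold
    }


# Hypothetical stand-in for thresholdsANN.pickle (assumed dict layout).
fake_thresholds = {mz_to_key(91.05): 0.40, mz_to_key(105.07): 0.55}
buffer = io.BytesIO()
pickle.dump(fake_thresholds, buffer)
buffer.seek(0)

# With the real file, open("thresholdsANN.pickle", "rb") would replace the buffer.
thresholds = pickle.load(buffer)

# Hypothetical per-position model outputs, keyed by mz*100.
scores = {9105: 0.40, 10507: 0.30}
peaks = predict_peaks(scores, thresholds)
print(peaks)
```

A score exactly equal to the threshold counts as a predicted peak, matching the "higher or equal" wording of the description. If the real files use a different structure (for example a list or array ordered by position), only the lookup inside `predict_peaks` would need to change.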
Created:
2024-10-23



