molxspec: Deep learning models for predicting MS2 spectra from molecular structures

Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/record/5717414
Description:
This repository contains a pre-processed dataset derived from the GNPS public repository of natural product mass spectra, together with pretrained model weights for four model architectures implemented in PyTorch (version 1.9.0). The contents are as follows:

gnps_processed_data.tgz: Tab-separated files of molecule/MS2 spectrum pairs derived from GNPS. Invalid structures and overly large molecules (larger than 2000 m/z) were filtered out, and only structures that yielded a valid 3D geometry optimization were kept. These processing steps were applied to the positive ionization mode data (pos_* files); negative ionization data is also included (neg_* files).

models.tgz: Pretrained models in PyTorch format for four architectures: MLP (a residual block multilayer perceptron trained on ECFP molecular fingerprints), BERT (the same MLP, but trained on SMILES representations from the Zinc V1 pretrained ChemBERTa models), GCN (a graph convolution architecture), and EGNN (an equivariant graph neural network). The models were trained on pos_processed_gnps_shuffled_with_3d_train.tsv, which is found in gnps_processed_data.tgz.
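As a starting point for exploring the data archive, the following is a minimal sketch that lists the files in a .tgz archive and previews the first rows of a tab-separated member, using only the Python standard library. The description does not state the directory layout inside the archive, the column names, or whether the files have a header row, so the helper names (list_archive_members, read_tsv_rows) are hypothetical and rows are returned as plain lists of strings without interpreting the columns.

```python
import csv
import io
import tarfile
from itertools import islice


def list_archive_members(archive_path):
    """Return the names of regular files inside a gzip-compressed tar archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return [m.name for m in archive.getmembers() if m.isfile()]


def read_tsv_rows(archive_path, member_name, limit=5):
    """Read up to `limit` rows of a tab-separated member without unpacking to disk."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        raw = archive.extractfile(member_name)
        if raw is None:
            raise ValueError(f"{member_name} is not a regular file")
        text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
        # QUOTE_NONE: treat quote characters literally, since the
        # quoting convention of the files is not documented (assumption).
        reader = csv.reader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(islice(reader, limit))


if __name__ == "__main__":
    archive_path = "gnps_processed_data.tgz"  # downloaded from the Zenodo record
    members = list_archive_members(archive_path)
    # The path of the training file inside the archive is not documented,
    # so it is located by its file name suffix.
    train_files = [
        name for name in members
        if name.endswith("pos_processed_gnps_shuffled_with_3d_train.tsv")
    ]
    for name in train_files:
        for row in read_tsv_rows(archive_path, name, limit=3):
            print(row)
```

Reading the member directly from the archive avoids unpacking the full dataset just to inspect its layout. Once the columns are known, the rows can be handed to whatever parser suits the spectrum encoding.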
Created:
2021-11-22