
Public Data files for MassFormer

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/7762057
Description:
Public data files for experiments in MassFormer. See the GitHub repository for instructions on how to use this data.

Raw Data:
casmi_2016.tgz - Critical Assessment of Small Molecule Identification 2016, used for model evaluation.
casmi_2022.tgz - Critical Assessment of Small Molecule Identification 2022, used for model evaluation.
mb_na_msms.msp.gz - MassBank of North America export of LC-MS/MS spectra, used for model evaluation.
cid_smiles.tsv.gz - Mapping of CID to SMILES strings, obtained from PubChem.

Processed Data:
proc_casmi_2016.tgz - Processed spectrum and molecule data for the CASMI 2016 benchmark.
proc_casmi_2022.tgz - Processed spectrum and molecule data for the CASMI 2022 benchmark.
proc_nist20_outlier.tgz - Processed spectrum and molecule data for the NIST20 Outlier benchmark (formerly called pseudo-CASMI).
proc_demo.tgz - Processed spectrum and molecule data for the demo (refer to code repository for more information).
cfm.tgz - Predicted spectra for the Competitive Fragmentation Modelling (CFM) baseline.

Model Checkpoints:
demo.pkl - Checkpoint of a MassFormer model trained on MoNA data, for the purposes of running the demo.
checkpoint_best_pcqm4mv2.pt - Checkpoint of a Graphormer model pretrained on the PCQM4M dataset, used for initialization of some MassFormer models. Copied from this url. Please refer to the Graphormer repository for more information.
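
The MassFormer repository remains the authoritative guide to using these files. As a rough illustration only, the sketch below shows one way to inspect two of the file types after downloading them, using only the Python standard library. The function names are hypothetical, and the column layout of cid_smiles.tsv.gz (CID in the first column, SMILES in the second, with an optional header row) is an assumption that should be checked against the actual file.

```python
import csv
import gzip
import tarfile


def load_cid_smiles(path):
    """Read a gzip-compressed TSV into a dict mapping CID -> SMILES.

    Assumption: two tab-separated columns (CID, SMILES). Rows whose first
    field is not an integer (e.g. a header row) are skipped.
    """
    mapping = {}
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 2 or not row[0].strip().isdigit():
                continue  # skip blank lines, short rows, or a header row
            mapping[int(row[0])] = row[1].strip()
    return mapping


def list_archive_members(path):
    """Return the member names stored in a .tgz archive without extracting it."""
    with tarfile.open(path, "r:gz") as archive:
        return archive.getnames()
```

Listing an archive's members before extracting it is a cheap way to see what a file such as proc_demo.tgz contains; the paths passed to these functions are whatever local locations the downloads were saved to.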
Created:
2023-10-03