
CANOPUS evaluation data

Source: Figshare. Updated: 2020-10-15. Indexed: 2026-04-08.
Download link:
https://figshare.com/articles/dataset/CANOPUS_evaluation_data/13073051/1
Description:
We evaluated CANOPUS on two MS/MS reference datasets: the SVM training dataset, which was also used for training CSI:FingerID (in 10-fold cross-validation), and the Agilent MassHunter library, used as an independent dataset.

The SVM training dataset contains spectra from GNPS, MassBank, and NIST17. As NIST17 is a commercial library, we can only provide the spectra from GNPS and MassBank. Here, we provide the public part of the SVM training dataset (svm_training_data.zip).

For training the deep neural network, we used a subset of PubChem with 1,106,938 structures for which we downloaded ClassyFire annotations (Feunang et al. 2016) and another set of 2,997,933 compounds from the ClassyFire database. The PubChem structures, together with ClassyFire annotations for the evaluation data, are available as "structures.csv.gz".

With CANOPUS, we analyzed data from two biological studies; the mzML and mzXML files are available at MassIVE (https://massive.ucsd.edu/) with the accession numbers MSV000079949 (mice data, Quinn et al. 2020) and MSV000081082 (Euphorbia plant data, Ernst et al. 2019). The network visualization of the mice data was done using Cytoscape (Shannon et al. 2003). Here, we provide the Cytoscape file (mice_multiple_classes.cys). The source code of CANOPUS is part of the SIRIUS GitHub repository (https://github.com/boecker-lab/sirius-libs). The scripts we used for analyzing and visualizing the data are available at the GitHub repository (https://github.com/kaibioinfo/canopus_treemap).

See the LICENSE.txt for further licensing information on ClassyFire annotations and mass spectra.
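
The description does not document the column layout of "structures.csv.gz", so a reasonable first step after downloading is to inspect the header and a few rows. The following is a minimal sketch using only the Python standard library. It assumes the file is comma-delimited, UTF-8 encoded, and starts with a header row; none of this is confirmed by the description. The helper name `peek_csv_gz` is hypothetical and not part of CANOPUS or SIRIUS.

```python
import csv
import gzip
import sys
from itertools import islice


def peek_csv_gz(path, n_rows=5, delimiter=","):
    """Return (header, first n_rows data rows) of a gzip-compressed delimited file.

    Assumption: the first line is a header row. The delimiter is a guess and
    can be overridden if the file turns out to be tab-separated.
    """
    # Open in text mode so the csv module receives decoded strings.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, [])           # empty list if the file is empty
        rows = list(islice(reader, n_rows)) # read only the first few rows
    return header, rows


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python peek.py structures.csv.gz
    header, rows = peek_csv_gz(sys.argv[1])
    print(header)
    for row in rows:
        print(row)
```

Reading only the first few rows keeps memory use small, which matters for a file that may hold on the order of a million structures. If the header comes back as a single long field, the file likely uses a different delimiter, and `delimiter="\t"` is worth trying.
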
Created:
2020-10-15