HREI-MSDB: High-resolution electron ionization mass spectral database for diverse volatile compounds
Source: DataCite Commons. Updated 2025-08-01; indexed 2025-09-08.
Download link:
https://figshare.com/articles/dataset/HREI-MSDB_High-resolution_electron_ionization_mass_spectral_database_for_diverse_volatile_compounds/29713460
Description:
This repository contains a database of high-resolution electron ionization (EI) mass spectra recorded under gas chromatography-mass spectrometry (GC-MS) conditions. The vast majority of publicly available GC-MS data sets were obtained using low-resolution mass spectrometry; the few exceptions include the works of E.J. Price, 2021, and V. Castro, 2022. At the same time, gas chromatography-high-resolution mass spectrometry (GC-HRMS) is used quite often in studies.

This database aims to provide a GC-HRMS data set covering diverse classes of volatile compounds (trimethylsilyl derivatives are not included!) over a wide <i>m</i>/<i>z</i> range (starting from <i>m</i>/<i>z</i> = 40). Mass spectra were recorded using an Orbitrap Exploris GC mass detector (Thermo Fisher Scientific, USA). The mass determination error is no more than 0.0006 Da, and the mass spectral resolution is 30000. All mass spectra were checked manually; the <code>.zip</code> archives contain information on peak annotations. The <code>data.xlsx</code> file contains a list of compounds and spectra IDs. Peaks with an intensity of less than <b>1/999</b> of the most intense peak were discarded.

The data set includes:<br>
130 mass spectra of pure compounds recorded using GC-MS of 10-molecule batches or GC-MS of individual compound solutions.<br>
61 mass spectra of compounds included in the 8270 MegaMix standard compound mixture.<br>
45 mass spectra of volatile compounds included in lavender essential oil.<br>
38 mass spectra of volatile compounds included in mint essential oil.<br>
33 mass spectra of volatile compounds included in lemon essential oil.<br>
22 mass spectra of volatile compounds included in coffee.<br>
These groups of spectra are designated as <code>Pure samples</code>, <code>8270 MegaMix Standard</code>, <code>Lavender (essential oil)</code>, <code>Mint (essential oil)</code>, <code>Lemon (essential oil)</code>, and <code>Coffee</code>, respectively, in the <code>data.xlsx</code> file and in the "Comments" tag in the MSP files.
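The group labels in the "Comments" tag and the 1/999 intensity cutoff described above can be illustrated with a minimal Python sketch. It assumes the common NIST MSP layout (`Key: value` header lines, then one `m/z intensity` pair per line, with records separated by blank lines) and that the "Comments" value holds only the group label; the actual files may differ in detail. The sample records, `parse_msp`, `apply_threshold`, and `count_groups` are hypothetical and are not part of the database or its software.

```python
import re
from collections import Counter

# Hypothetical two-record sample in NIST MSP layout (not taken from the database).
SAMPLE_MSP = """\
Name: Example compound A
Comments: Pure samples
Num Peaks: 3
41.0386 120
69.0699 999
84.0934 0.5

Name: Example compound B
Comments: Coffee
Num Peaks: 2
43.0178 999
58.0413 250
"""


def parse_msp(text):
    """Parse MSP text into a list of dicts: header fields plus a 'peaks' list."""
    records = []
    # Records are assumed to be separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        fields, peaks = {}, []
        for line in block.splitlines():
            line = line.strip()
            if not line:
                continue
            if ":" in line and not line[0].isdigit():
                # Header line such as "Comments: Coffee".
                key, value = line.split(":", 1)
                fields[key.strip()] = value.strip()
            else:
                # Peak line: m/z followed by intensity.
                mz, intensity = line.split()[:2]
                peaks.append((float(mz), float(intensity)))
        fields["peaks"] = peaks
        records.append(fields)
    return records


def apply_threshold(peaks, denominator=999):
    """Keep peaks whose intensity is at least 1/denominator of the base peak."""
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    return [(mz, i) for mz, i in peaks if i * denominator >= base]


def count_groups(records):
    """Count spectra per group label found in the Comments field."""
    return Counter(r.get("Comments", "") for r in records)


if __name__ == "__main__":
    recs = parse_msp(SAMPLE_MSP)
    print(count_groups(recs))
    print(apply_threshold(recs[0]["peaks"]))
```

The published spectra already have the cutoff applied, so `apply_threshold` is only useful for reproducing the same filtering on other data.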
<b>Please note which spectrum was obtained in what way.</b> Identification of compounds in essential oils and coffee is quite reliable, but it was performed without using standard samples.

For convenience, in some cases (for essential oils), SMILES are provided with symbols denoting stereoisomers, but we cannot be sure which stereoisomer is actually present: often, both the retention indices and the mass spectra of stereoisomers are very close.

Detailed information on the experimental conditions under which the spectra were obtained, on the equipment, and on data processing is contained in the <code>info.pdf</code> file. The <code>quality_assessment.xlsx</code> file contains data obtained during quality control of the mass spectra (see the <code>info.pdf</code> file for additional information).

Each file named all_spectra contains all spectra (both those obtained using the sample collection and those obtained from essential oil and coffee samples) in a different file format. <b>Most likely, you need the all_spectra.msp file (NIST-compatible); it contains all the data.</b> The <code>plant_volatiles.msp</code> file contains all mass spectra obtained from essential oils and coffee. The names of the remaining files are self-explanatory. If you need annotations of all peaks or more file formats, look at the <code>.zip</code> archives. <b>JCAMP (.jdx) files are in the .zip archives.</b>

Processing (interpretation) of mass spectra was done using our software:<br>https://github.com/mtshn/gchrmsexplain versions <code>0.0.2</code> and <code>0.0.3</code>.<br>The settings used are given in the <code>info.pdf</code> file; however, these settings are the defaults for the corresponding versions.<br>

Levels of explanation of each peak in the mass spectrum:<br><b>Level 1</b> - the molecular formula is selected, but some isotopic peaks are not found at all.<br><b>Level 2</b> - isotopic peaks merge with other peaks. 
For example, the 13C peak of some ion X is superimposed (taking into account the resolution) on the main peak of X + H. At moderate resolution, such peaks may not be resolved. This level also includes cases of "incorrect" isotopic peak intensity, that is, intensity differing from the theoretically calculated one.<br><b>Level 3</b> - all main isotopic peaks are observed correctly, up to the accuracy of mass determination.

The minimum number of bonds that must be broken to obtain a given fragment is indicated without counting the loss of hydrogens or certain other "trivial" bond breaks: the loss of a halogen atom, the loss of a methyl group, and NO loss from a nitro group. Details are given in the documentation of the software used to process the mass spectra: https://github.com/mtshn/gchrmsexplain.

In files containing abbreviated interpretations of mass spectra (e.g., in the <code>CSV_annotated</code> folders in the <code>.zip</code> archives), notations like <code>3-1</code> are used. The first number denotes the interpretation level (see above), and the second denotes the number of (non-trivial) bond breaks required to obtain such a molecular formula.
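As an illustration of the abbreviated notation and of the Level 2 example above, the following Python sketch splits a code such as `3-1` into its level and bond-break count, and estimates when the 13C peak of an ion X falls within one peak width of the X + H peak. The functions `parse_annotation` and `peaks_overlap` are hypothetical and are not part of gchrmsexplain. The overlap check assumes a constant resolving power R = m/Δm (FWHM) over the whole <i>m</i>/<i>z</i> range, which is a simplification, and the criterion actually used by the software may differ.

```python
import re

# Short labels paraphrasing the level descriptions given above.
LEVELS = {
    1: "formula selected, some isotopic peaks not found",
    2: "isotopic peaks merged with other peaks or wrong intensity",
    3: "all main isotopic peaks observed correctly",
}

# Standard isotope masses in Da.
MASS_13C_MINUS_12C = 1.0033548  # 13.0033548 - 12 (exactly)
MASS_1H = 1.0078250
# Distance between the 13C isotopologue of X and the monoisotopic X + H peak.
SEPARATION = MASS_1H - MASS_13C_MINUS_12C  # about 0.00447 Da


def parse_annotation(code):
    """Split a code like '3-1' into (level, number of non-trivial bond breaks)."""
    match = re.fullmatch(r"\s*(\d+)-(\d+)\s*", code)
    if match is None:
        raise ValueError(f"unrecognized annotation: {code!r}")
    level, breaks = int(match.group(1)), int(match.group(2))
    if level not in LEVELS:
        raise ValueError(f"unknown interpretation level: {level}")
    return level, breaks


def peaks_overlap(mz, resolution=30000):
    """Return True if the two peaks are closer than one FWHM (mz / resolution)."""
    fwhm = mz / resolution
    return SEPARATION < fwhm


if __name__ == "__main__":
    level, breaks = parse_annotation("3-1")
    print(level, LEVELS[level], breaks)
    for mz in (100.0, 200.0):
        print(mz, peaks_overlap(mz))
```

Under these assumptions the two peaks start to merge at roughly <i>m</i>/<i>z</i> 134 for a resolution of 30000 (0.00447 Da × 30000), which is why such cases are more likely for heavier ions.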
Provider:
figshare
Created:
2025-07-31



