Gas Chromatography-Mass Spectrometry Long-term Instrumental Drift Data over 155 days

Source: Figshare. Updated: 2025-10-06. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Gas_Chromatography-Mass_Spectrometry_Long-term_Instrumental_Drift_Data_over_155_days/30284446
Description:
Over a span of 155 days, we performed 20 repeated tests in 7 batches on the smoke composition of six commercial tobacco products using gas chromatography-mass spectrometry (GC-MS). By measuring pooled quality control (QC) samples 20 times to establish the parameters of the correction algorithms, we achieved reliable correction even for data with large fluctuations. Three algorithms, spline interpolation (SC), support vector regression (SVR), and Random Forest (RF), were used to normalise 178 target substances in the six samples. For chemical components present in the test samples but absent from the QC samples, normalisation was done either by using the adjacent chromatographic peak for correction or by applying the average correction coefficients derived from all peaks. The results show that the Random Forest algorithm provides the most stable correction model for long-term, highly variable data. Principal component analysis (PCA) and standard deviation analysis confirm satisfactory correction performance. Correction models based on the SC and SVR algorithms gave less stable correction outcomes; for data with large variation, SVR tends to over-fit and over-correct. Our study shows that, for long-term GC-MS measurements, QC sample measurements combined with an appropriate correction algorithm can compensate for measurement variability, enabling reliable data tracking and quantitative comparison over extended periods. The raw GC-MS data in this dataset are provided as mzXML files for the QC sample and the six different samples obtained over the 155 days.
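The QC-based normalisation described above, including the two fallbacks for components absent from the QC samples, can be illustrated with a small sketch. It is a deliberate simplification and not the authors' method: instead of fitting a spline, SVR, or Random Forest model, each correction coefficient is simply the QC peak area in a run divided by the median QC area for that peak. All function names, peak names, retention times, and areas below are hypothetical and chosen only for illustration.

```python
from statistics import mean, median

def qc_correction_factors(qc_areas):
    """Per-run factors for each QC peak: area in run i / median area over all runs.

    Simplifying assumption: a plain ratio stands in for a fitted drift model.
    """
    factors = {}
    for peak, areas in qc_areas.items():
        ref = median(areas)
        factors[peak] = [a / ref for a in areas]
    return factors

def factors_for_peak(peak, rt, factors, qc_rts, strategy="adjacent"):
    """Return per-run factors for a sample peak, with a fallback if it is not in the QC."""
    if peak in factors:
        return factors[peak]
    if strategy == "adjacent":
        # Borrow the factors of the QC peak closest in retention time.
        nearest = min(qc_rts, key=lambda p: abs(qc_rts[p] - rt))
        return factors[nearest]
    if strategy == "average":
        # Average the factors of all QC peaks, run by run.
        n_runs = len(next(iter(factors.values())))
        return [mean(f[i] for f in factors.values()) for i in range(n_runs)]
    raise ValueError(f"unknown strategy: {strategy}")

def correct(areas, peak_factors):
    """Divide each measured area by the factor of the run it came from."""
    return [a / f for a, f in zip(areas, peak_factors)]

# Hypothetical QC data: two peaks measured in three runs.
qc_areas = {"peak_A": [100.0, 80.0, 120.0], "peak_B": [50.0, 45.0, 55.0]}
qc_rts = {"peak_A": 5.0, "peak_B": 9.0}  # retention times in minutes (made up)
factors = qc_correction_factors(qc_areas)

# Hypothetical sample peak that is absent from the QC, eluting near peak_A.
sample_areas = [200.0, 160.0, 240.0]
adjacent = factors_for_peak("peak_C", 5.6, factors, qc_rts, strategy="adjacent")
corrected_adjacent = correct(sample_areas, adjacent)
print(corrected_adjacent)
```

In this toy case the sample peak drifts in step with its neighbouring QC peak, so the adjacent-peak fallback flattens it to a near-constant value, while the "average" strategy would correct it only partially. A fitted model such as those compared in the study would replace the ratio in `qc_correction_factors`.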
Created:
2025-10-06