
Replication Data for: Integrating C-H Information to Improve Machine Learning Classification Models for Microplastic Identification from Raman Spectra

Source: DataCite Commons. Updated: 2025-11-20. Indexed: 2025-04-09.
Download link:
https://borealisdata.ca/citation?persistentId=doi:10.5683/SP3/KUS7OB
Description:
Uniform, consistent spectroscopic databases of Raman spectra are important if the community is to maximize the value of emerging machine learning techniques. This dataset contains processed and augmented Raman spectra acquired from a variety of common plastics, with variations in manufacturer and in properties such as plastic color. The spectra span the frequency window from 300 to 3900 cm⁻¹, were collected under varied instrumentation settings, were interpolated to a 1 cm⁻¹ wavenumber spacing to ensure compatibility, and were augmented 5× by random scaling and the introduction of artificial noise. Three versions of the data are provided, each enabling exploration of a different strategy for training machine learning classification models. The data were used to train microplastic classification models with the K-nearest neighbors algorithm from the sklearn (scikit-learn) package in Python, as published in the associated manuscript. The dataset also includes Python pickle files containing the optimized models and supporting information for them. The data were created by the authors and are posted in support of this research.
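The resampling and augmentation steps described above might look roughly like the following minimal sketch. It is an illustration only, not the authors' code: the function names, the scaling range, the noise level, and the reading of "5×" as five augmented copies per spectrum are all assumptions that the description does not specify.

```python
import numpy as np


def resample_spectrum(wavenumbers, intensities, start=300.0, stop=3900.0, step=1.0):
    """Linearly interpolate a spectrum onto a uniform wavenumber grid.

    The 300-3900 cm^-1 window and 1 cm^-1 spacing come from the dataset
    description; the use of linear interpolation is an assumption.
    """
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    order = np.argsort(wavenumbers)  # np.interp requires increasing x values
    grid = np.arange(start, stop + step, step)
    return grid, np.interp(grid, wavenumbers[order], intensities[order])


def augment_spectrum(intensities, n_copies=5, scale_range=(0.8, 1.2),
                     noise_fraction=0.01, rng=None):
    """Return n_copies randomly scaled, noisy variants of one spectrum.

    scale_range and noise_fraction are hypothetical values chosen for
    illustration; the description only says "random scaling and
    artificial noise".
    """
    rng = np.random.default_rng() if rng is None else rng
    intensities = np.asarray(intensities, dtype=float)
    # One multiplicative scale factor per augmented copy.
    scales = rng.uniform(scale_range[0], scale_range[1], size=(n_copies, 1))
    # Gaussian noise sized relative to the spectrum's intensity range.
    sigma = noise_fraction * (intensities.max() - intensities.min())
    noise = rng.normal(0.0, sigma, size=(n_copies, intensities.size))
    return scales * intensities + noise
```

Resampling every spectrum onto the same grid gives each one the same feature vector length, which is what a distance-based classifier such as K-nearest neighbors needs in order to compare spectra point by point.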
Provider:
Borealis
Created:
2024-09-19