
Customizable Machine Learning Models for Rapid Microplastic Identification Using Raman Microscopy

Source: DataONE. Updated: 2024-09-24. Indexed: 2025-04-26.
Download link:
https://search.dataone.org/view/sha256:eb841ea00b6bb9c12eee6d06657b29db09cd8a87a136f2e00032cca212e18298
Description:
Variations in Raman spectroscopic instrumentation alter data structure, introducing inconsistencies that hinder the development of community-wide analytical tools. This dataset consists of full-window Raman spectra for a variety of common plastics; the spectra are both high resolution (<1 cm⁻¹ wavenumber spacing) and span the full range of 100 to 4000 cm⁻¹. The utility of this data structure for building advanced data analysis tools is demonstrated by using the data to train several classification models, then applying those models to classify spectra acquired on 2-dimensional Raman microscopic maps of diverse plastic microparticles. Specifically, the scikit-learn (sklearn) package in Python is used to train models based on random-forest, K-nearest neighbors, and multi-layer perceptron algorithms. The dataset also allows the spectroscopic resolution to be downgraded so that classification models can be tailored to individual instrument setups: sample tests show that high classification accuracy is maintained even when the Raman shift spacing is coarsened to 1, 2, 4, or 8 cm⁻¹. The training data were created by the authors. The data were also tested using Raman spectra obtained from the public domain.
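
The workflow described above (downgrade the Raman shift spacing, then train random-forest, K-nearest neighbors, and multi-layer perceptron classifiers) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the spectra are synthetic, the class names and peak positions are hypothetical placeholders rather than real polymer bands, the helper names (resample_spectra, synthetic_spectra, evaluate_spacings) are invented here, and plain linear interpolation is assumed as the downgrading method, which the description does not specify.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

# Hypothetical classes with made-up peak centers (cm^-1); NOT real polymer bands.
HYPOTHETICAL_PEAKS = {
    "plastic_A": [1000.0, 1600.0, 3050.0],
    "plastic_B": [1300.0, 1440.0, 2880.0],
    "plastic_C": [810.0, 1700.0, 2950.0],
}


def resample_spectra(shifts, spectra, spacing, start=100.0, stop=4000.0):
    """Interpolate each spectrum (one per row) onto an evenly spaced, coarser grid.

    Assumption: simple linear interpolation, with no smoothing before decimation.
    """
    new_shifts = np.arange(start, stop + spacing / 2, spacing)
    resampled = np.vstack([np.interp(new_shifts, shifts, row) for row in spectra])
    return new_shifts, resampled


def synthetic_spectra(shifts, n_per_class, rng):
    """Build noisy Gaussian-peak spectra as a stand-in for the real dataset."""
    X, y = [], []
    for label, centers in HYPOTHETICAL_PEAKS.items():
        for _ in range(n_per_class):
            spectrum = np.zeros_like(shifts)
            for c in centers:
                amplitude = 1.0 + 0.1 * rng.standard_normal()
                spectrum += amplitude * np.exp(-0.5 * ((shifts - c) / 10.0) ** 2)
            spectrum += 0.05 * rng.standard_normal(shifts.size)
            X.append(spectrum)
            y.append(label)
    return np.array(X), np.array(y)


def evaluate_spacings(spacings=(1, 2, 4, 8), n_per_class=20, seed=0):
    """Return {spacing: {model_name: held-out accuracy}} on the synthetic data."""
    rng = np.random.default_rng(seed)
    # High-resolution grid: 0.5 cm^-1 spacing over 100 to 4000 cm^-1.
    shifts = np.arange(100.0, 4000.25, 0.5)
    X_full, y = synthetic_spectra(shifts, n_per_class, rng)

    results = {}
    for spacing in spacings:
        _, X = resample_spectra(shifts, X_full, float(spacing))
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.3, stratify=y, random_state=seed
        )
        models = {
            "random_forest": RandomForestClassifier(n_estimators=50, random_state=seed),
            "knn": KNeighborsClassifier(n_neighbors=3),
            "mlp": MLPClassifier(hidden_layer_sizes=(32,), max_iter=300, random_state=seed),
        }
        results[spacing] = {
            name: model.fit(X_train, y_train).score(X_test, y_test)
            for name, model in models.items()
        }
    return results


if __name__ == "__main__":
    for spacing, scores in evaluate_spacings().items():
        print(f"{spacing} cm^-1 spacing:", scores)
```

To tailor a model to a given instrument, the same resampling step would be applied to the real training spectra using that instrument's Raman shift grid before fitting. Accuracy on synthetic peaks says nothing about performance on real microplastic spectra.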

Created:
2024-10-02