
Mid-infrared spectral dataset and machine learning tools for defect detection in green and roasted coffee

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://data.mendeley.com/datasets/dtd2fggpyp
Description:
This repository provides a comprehensive mid-infrared (FTIR) spectral dataset of defect-free and defective coffee beans at both green and roasted processing stages, together with spectral preprocessing workflows and machine learning tools for defect classification. The dataset includes spectra acquired in the wavenumber range of 4000–650 cm⁻¹ using ATR-FTIR spectroscopy, representing Control samples (defect-free) and five industry-relevant defect categories: bitten, discolored, insect-damaged (drill bit), sour (vinegar), and black defects.

To support robust spectral analysis and facilitate reproducible modeling, the repository also includes spectra preprocessed using commonly applied chemometric techniques, including baseline correction, Standard Normal Variate (SNV), Multiplicative Scatter Correction (MSC), and Savitzky–Golay first and second derivative transformations. These preprocessing methods allow users to evaluate the influence of spectral correction strategies on classification performance and feature extraction.

In addition to the spectral datasets, this repository provides R-based machine learning workflows for the simultaneous classification of defective and non-defective coffee samples in both green and roasted states. The computational tools include Support Vector Machine (SVM) and Random Forest (RF) algorithms, together with scripts for data preprocessing, model calibration, validation, and performance evaluation. These tools enable reproducible development and benchmarking of classification models for spectroscopy-based food quality assessment.

This dataset may be valuable for researchers in food science, spectroscopy, chemometrics, and machine learning, as well as for coffee industry stakeholders interested in developing rapid, non-destructive quality control systems. Furthermore, the availability of both spectral data and computational tools facilitates reuse in applications such as food authentication, defect detection, and the development of data-driven quality monitoring strategies in agri-food systems.
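
For readers unfamiliar with the scatter corrections named above, the following is a minimal sketch of SNV and MSC in plain Python. It is not taken from the repository's R scripts. The function names `snv` and `msc` and the toy absorbance values are assumptions made purely for illustration, and the standard library is used only to keep the example dependency-free.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate: center one spectrum and scale it to unit standard deviation."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in spectrum]


def msc(spectra):
    """Multiplicative Scatter Correction, using the mean spectrum of the set as reference."""
    n_points = len(spectra[0])
    reference = [mean(s[i] for s in spectra) for i in range(n_points)]
    ref_mean = mean(reference)
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        # Least-squares fit of: spectrum = intercept + slope * reference
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(reference, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        # Remove the fitted offset and gain
        corrected.append([(x - intercept) / slope for x in s])
    return corrected


# Hypothetical toy "spectra": one base curve plus two copies with a different gain and offset
base = [0.10, 0.30, 0.80, 0.40, 0.20]
spectra = [
    base,
    [2.0 * x + 0.5 for x in base],
    [0.5 * x - 0.1 for x in base],
]

snv_spectra = [snv(s) for s in spectra]
msc_spectra = msc(spectra)
```

In this toy set the three spectra differ only by a positive gain and an offset, so either correction maps them onto a single curve. Real ATR-FTIR spectra also differ chemically, which is the variation these corrections are meant to leave in place.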
Created:
2026-04-10