
Hyperspectral images of bulk grain samples for variety classification

DataCite Commons | Updated 2022-03-14 | Indexed 2024-07-13
Download link:
https://erda.ku.dk/archives/89a8b2b044d458e487fc2ce56927f420/published-archive.html
Description:
Hyperspectral image database used for the paper "Study of hyperspectral imaging for classification of bulk grain samples with deep convolutional neural networks", published in the Journal of Near Infrared Spectroscopy (2022). The database contains about 446 labeled hyperspectral NIR images of 8 different grain samples, with 50-600 kernels in each image. It is split into training, validation, and test datasets.

The database can be downloaded and used freely for non-commercial research and educational purposes, on the condition that the original paper is cited. The authors make no warranties regarding the database or its fitness for a particular purpose. The user of the database accepts full responsibility for his or her use of the database.

The database contains 8 grain samples: 4 conventional wheat varieties from 2018, of which 1 is a spring wheat and 3 are conventional winter wheat varieties of high baking quality (Type A) and low/medium baking quality (Type B); 2 organic spring wheat heritage varieties, named Halland and Øland wheat; 1 organic spelt (dinkel wheat); and 1 Midsummer Rye. All 8 samples were stored under similar dry conditions for at least a year. Each of the 8 samples was divided into training, validation, and test sets. Details of the three datasets are given in Table 1 of the paper.

The samples were measured in 3 sequences, each in a random order. For each training sample, 10 images of densely packed kernels were acquired, followed by 6 images of sparsely packed kernels. For both the validation and test sets, 5 dense images followed by 3 sparse images were acquired. The number of grains in one dense image varied between 150 and 600 kernels, depending on grain variety. The sparse images contain about 50 to 100 kernels, depending on the kernel size of the grain type. This image sequence was repeated 3 times on three separate days, each time with a different random order of samples.
The test set was measured only on the first and third measurement days. Between the second and third measurement days, a small adjustment of the hyperspectral camera position and focus was made to introduce variation in the experimental setup.

Each image was acquired by placing the kernels in a sample tray, which was presented to the Specim FX17 HSI camera (range 900-1700 nm) using a sample conveyor. Two pieces of white PTFE foil were placed in front of and behind the sample tray to generate white references for each image. The pixel size was standardized using checkerboards placed in front of and behind the sample tray. Dark images were acquired before each sample for each sequence.

Note that the FX17 is a line-scan camera and outputs two-dimensional images (w# x c#) with one spatial (w#) and one spectral (c#) dimension. The hyperspectral image (h# x w# x c#) is constructed by stacking the series of camera outputs, each of which represents a different time point and therefore a different conveyor and sample position.

The hyperspectral images in the database are stored in .cdf data format with uint16 precision. They are corrected for dark noise in each camera pixel and normalized with the white value for each camera pixel, and the image pixel size has been standardized to 0.15 x 0.15 mm^2. See LOADDATA.py for an example of how to read the uint16 data in the .cdf data files and convert it to absorbance, as well as an explanation of additional metadata.

About the stored data values:
Raw hyperspectral image --> D
Dark value for each camera pixel --> DTN
White value for each camera pixel --> WTN
Reflection image --> R = (D - DTN) / (WTN - DTN)
Data normalized to fit uint16 --> DN = (R - min(R)) / (max(R) - min(R)) * (2**16 - 1)
The data DN is saved in the .cdf file.

The database is structured in subfolders according to "/dataset/sample_type/", e.g. "/Training/Wheat_H1/", and the hyperspectral image data files are named "Series#_datetime.cdf".
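The stored-value formulas above can be sketched in NumPy on a small synthetic cube. This is only an illustration and not the dataset's LOADDATA.py: the function names, the `r_min`/`r_max` values (standing in for whatever metadata carries min(R) and max(R)), the rounding before the uint16 cast, and the absorbance definition A = -log10(R) are all assumptions made here.

```python
import numpy as np

UINT16_MAX = 2**16 - 1


def to_reflectance(raw, dark, white):
    """R = (D - DTN) / (WTN - DTN), applied per camera pixel (w x c)."""
    return (raw.astype(np.float64) - dark) / (white - dark)


def encode_uint16(refl):
    """DN = (R - min(R)) / (max(R) - min(R)) * (2**16 - 1).

    Returns DN together with min(R) and max(R), which are needed to undo
    the scaling. Rounding before the cast is a choice made in this sketch.
    """
    r_min, r_max = float(refl.min()), float(refl.max())
    scaled = (refl - r_min) / (r_max - r_min) * UINT16_MAX
    return np.round(scaled).astype(np.uint16), r_min, r_max


def decode_uint16(dn, r_min, r_max):
    """Invert the uint16 scaling to recover reflectance."""
    return dn.astype(np.float64) / UINT16_MAX * (r_max - r_min) + r_min


def to_absorbance(refl, eps=1e-6):
    """Assumed definition: A = -log10(R); R is clipped to keep the log finite."""
    return -np.log10(np.clip(refl, eps, None))


# Synthetic example: h scan lines, w spatial pixels, c spectral channels.
rng = np.random.default_rng(0)
h, w, c = 4, 5, 6
dark = rng.uniform(90, 110, size=(w, c))       # DTN, one value per camera pixel
white = rng.uniform(3000, 3500, size=(w, c))   # WTN, one value per camera pixel
true_refl = rng.uniform(0.05, 0.95, size=(h, w, c))
raw = true_refl * (white - dark) + dark        # simulated raw image D

refl = to_reflectance(raw, dark, white)
dn, r_min, r_max = encode_uint16(refl)
recovered = decode_uint16(dn, r_min, r_max)
absorbance = to_absorbance(recovered)
```

Because DN is min-max scaled per image, the stored integers alone do not determine R. The minimum and maximum reflectance have to come from somewhere, which is presumably part of the additional metadata that LOADDATA.py explains.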
"Series#" refers to the image sequence name and number, and "datetime" to the date and time of the image acquisition, e.g. "Dense_Series1_20_09_07_14_40_09.cdf". The database also contains pseudo-RGB images generated from the hyperspectral images, named "Series#_datetime.jpg", as well as NumPy arrays containing hyperspectral images of the white reference PTFE foil for each file in "WHITESeries#_datetime.npy" and pseudo-RGB images of the white reference data in "WHITESeries#_datetime.jpg".
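The folder and filename convention can be parsed with the Python standard library, as in the sketch below. `ImageFileInfo` and `parse_image_path` are hypothetical names introduced for illustration. The timestamp order (two-digit year, month, day, hour, minute, second) is an assumption inferred from the single example filename, and the pattern deliberately does not cover the "WHITE" reference files.

```python
import re
from datetime import datetime
from pathlib import PurePosixPath
from typing import NamedTuple


class ImageFileInfo(NamedTuple):
    dataset: str        # e.g. "Training"
    sample_type: str    # e.g. "Wheat_H1"
    series_name: str    # e.g. "Dense"
    series_number: int  # e.g. 1
    acquired: datetime  # acquisition date and time
    extension: str      # "cdf" or "jpg"


# <name>_Series<number>_<six two-digit fields>.<ext>
_NAME_RE = re.compile(
    r"^(?P<name>[A-Za-z]+)_Series(?P<number>\d+)_"
    r"(?P<stamp>\d{2}(?:_\d{2}){5})\.(?P<ext>cdf|jpg)$"
)


def parse_image_path(path: str) -> ImageFileInfo:
    """Split ".../dataset/sample_type/Series#_datetime.ext" into its parts."""
    p = PurePosixPath(path)
    match = _NAME_RE.match(p.name)
    if match is None or len(p.parts) < 3:
        raise ValueError(f"unexpected path layout: {path!r}")
    # Assumed field order: YY_MM_DD_HH_MM_SS.
    acquired = datetime.strptime(match["stamp"], "%y_%m_%d_%H_%M_%S")
    return ImageFileInfo(
        dataset=p.parts[-3],
        sample_type=p.parts[-2],
        series_name=match["name"],
        series_number=int(match["number"]),
        acquired=acquired,
        extension=match["ext"],
    )


info = parse_image_path("/Training/Wheat_H1/Dense_Series1_20_09_07_14_40_09.cdf")
print(info)
```

Raising `ValueError` on any name that does not match keeps a wrong guess about the naming scheme from silently producing bad labels when walking the archive.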
Provided by:
University of Copenhagen
Created:
2022-03-14