CSEM-MISD - CSEM's Multi-Illumination Surface Defect Detection Dataset
Source: Mendeley Data, updated 2024-05-10, indexed 2024-06-30
Download link:
https://zenodo.org/records/7410513
Description:
In automated surface visual inspection, a part often has to be imaged under many different illumination conditions before all of its defects become visible. To address this issue, at CSEM we acquired a real-world multi-illumination defect segmentation dataset, called CSEM-MISD, and we release it for research purposes to benefit the community.
The dataset consists of three types of metallic parts: washers, screws, and gears. Each part was captured in a half-spherical light-dome system that filtered out all ambient light and successively illuminated the part from 108 distinct illumination angles. Every group of 12 illumination angles shares the same elevation level, and adjacent illumination angles on the same level are 30 degrees apart in azimuth. For more details, please read Sections 3 and 4 of our paper.
The washers dataset features 70 defective parts. The gears and screws datasets each feature 35 defective parts, 35 intact parts, and several hundred unannotated parts. Some defects, such as notches and holes, are visible in most images (illuminations), with intensity and texture varying among them, while others, such as scratches, are visible in only a few.
We split the datasets into train and test sets. Each train set contains 32 samples and each test set 38 samples. Each sample comprises 108 images (each captured under a different illumination angle), an automatically extracted foreground segmentation mask, and a hand-labeled defect segmentation mask.
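The light-dome layout described above (108 illumination angles, 12 per elevation level, 30 degrees apart in azimuth) can be sketched as a small grid. This is only an illustration: the helper name `illumination_grid`, the 1-based level numbering, and the 0-degree azimuth origin are assumptions of this sketch, and the actual elevation angles are not given in this description.

```python
from itertools import product

N_ILLUMINATIONS = 108    # stated in the dataset description
PER_LEVEL = 12           # illumination angles sharing one elevation level
AZIMUTH_STEP_DEG = 30    # stated azimuth spacing between neighbours on a level


def illumination_grid(azimuth_offset_deg=0.0):
    """Return (elevation_level, nominal_azimuth_deg) for every illumination.

    The azimuth origin is an assumption; the description only fixes the
    30-degree spacing between adjacent lights on the same level.
    """
    n_levels = N_ILLUMINATIONS // PER_LEVEL  # 108 / 12 = 9 elevation levels
    return [
        (level, (azimuth_offset_deg + k * AZIMUTH_STEP_DEG) % 360)
        for level, k in product(range(1, n_levels + 1), range(PER_LEVEL))
    ]


grid = illumination_grid()
print(len(grid), grid[0], grid[-1])
```

The nominal azimuths produced this way are only approximate, since the dataset authors note that the azimuth angles were not calibrated.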
This dataset is challenging mainly because:
1. each raw sample consists of 108 greyscale images of resolution 512×512 and therefore takes 27 MB of space;
2. the metallic surfaces produce many specular reflections that sometimes saturate the camera sensor;
3. the annotations are not very precise, because the exact extent of a defect contour is always subjective;
4. the defects are very sparse in the spatial dimensions as well: they cover only about 0.2% of the total image area in gears, 0.8% in screws, and 1.4% in washers, which makes the dataset unbalanced, with a highly skewed class representation.
The dataset is organized as follows:
- each sample resides in the Test, Train, or Unannotated directory;
- each sample has its own directory, which contains the individual images, the foreground segmentation mask, and the defect segmentation mask;
- each image is stored in 8-bit greyscale PNG format and has a resolution of 512×512 pixels;
- image file names consist of three string fields separated by underscores: `prefix_sampleNr_illuminationNr.png`, where the prefix is e.g. washer, the sampleNr might be a three-digit number such as 001, and the illuminationNr has three digits: the first is the elevation index (1 for the highest angle, 9 for the lowest), and the other two are the azimuth index (01-12).
Each dataset contains `light_vectors.csv`, which lists the illumination angles (in lexicographic order of the illuminationNr), and `light_intensities.csv`, which lists the corresponding light intensities on a scale from 0 to 127.
Please be aware that the azimuth angles were not calibrated and might be misaligned by a few degrees. We provide data loaders implemented in Python at the project's repository.
If you find our dataset useful, please cite our paper: Honzátko, D., Türetken, E., Bigdeli, S. A., Dunbar, L. A., & Fua, P. (2021). Defect segmentation for multi-illumination quality control systems. Machine Vision and Applications.
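The file-naming scheme and the 27 MB figure above can be checked with a short sketch. It is not the project's official data loader: `ImageId`, `parse_image_name`, and `light_row_index` are hypothetical names, and the mapping to a 0-based row of `light_vectors.csv` assumes the file has exactly one row per illumination and no header, which this description does not confirm.

```python
import re
from typing import NamedTuple


class ImageId(NamedTuple):
    prefix: str         # e.g. "washer"
    sample_nr: str      # kept as a string to preserve leading zeros
    elevation_idx: int  # 1 (highest angle) .. 9 (lowest angle)
    azimuth_idx: int    # 1 .. 12


# prefix_sampleNr_illuminationNr.png, illuminationNr = 1 elevation digit + 2 azimuth digits
_NAME_RE = re.compile(r"^([A-Za-z]+)_(\d+)_([1-9])(\d{2})\.png$")


def parse_image_name(name: str) -> ImageId:
    """Split an image file name into its documented fields."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not an illumination image name: {name!r}")
    prefix, sample_nr, elevation, azimuth = match.groups()
    azimuth_idx = int(azimuth)
    if not 1 <= azimuth_idx <= 12:
        raise ValueError(f"azimuth index out of range in {name!r}")
    return ImageId(prefix, sample_nr, int(elevation), azimuth_idx)


def light_row_index(image_id: ImageId) -> int:
    """0-based rank of the illumination in lexicographic illuminationNr order.

    Assumption: this rank matches the row order of light_vectors.csv.
    """
    return (image_id.elevation_idx - 1) * 12 + (image_id.azimuth_idx - 1)


# Raw size of one sample: 108 images, 512 x 512 pixels, one byte per pixel.
RAW_SAMPLE_BYTES = 108 * 512 * 512

image_id = parse_image_name("washer_001_305.png")
print(image_id, light_row_index(image_id), RAW_SAMPLE_BYTES / 2**20)
```

Because the illuminationNr field has a fixed width, its lexicographic order coincides with its numeric order, so the rank can be computed directly from the two indices. The regular expression intentionally rejects other files in a sample directory, such as the mask files, whose names are not specified here.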
Created:
2023-06-28

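Background and challenges
Background overview
CSEM-MISD is a multi-illumination surface defect detection dataset containing images of three types of metallic parts under different illumination conditions, intended for research on defect segmentation. It provides images from 108 illumination angles, foreground segmentation masks, and defect segmentation masks; its main challenges include reflections from the metallic surfaces and the sparsity of the defects.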



