Chinese Chemical Safety Signs (CCSS)

Indexed from NIAID Data Ecosystem on 2026-03-14

Download link:

https://zenodo.org/record/5482333

Description:

Notice: We currently have a paper under double-blind review that introduces this dataset. Therefore, we have anonymized the dataset authorship. Once the review process has concluded, we will update the authorship information of this dataset.

This dataset is compiled as a benchmark for recognizing chemical safety signs from images. We provide both the dataset and the experimental results at doi:10.5281/zenodo.5482334.

1. The Dataset

The complete dataset is contained in the folder ccss/data in archive css_data.zip. The images include signs based on the Chinese standard "Safety Signs and their Application Guidelines" (GB 2894-2008) for safety signs in chemical environments. This standard, in turn, refers to the standards ISO 7010 (Graphical symbols – Safety Colours and Safety Signs – Safety signs used in workplaces and public areas), GB/T 10001 (Public Information Graphic Symbols for Signs), and GB 13495 (Fire Safety Signs).

1.1. Image Collection

We collected photos of chemical safety signs commonly used in chemical laboratories and chemical teaching buildings. For a discussion of the standards on which we base our collection, refer to the book "Talking about Hazardous Chemicals and Safety Signs" for common signs and to the safety signs guidelines (GB 2894-2008).

The photos were mainly taken in 6 locations: on the road, in a parking lot, at construction walls, in a chemical laboratory, outside near big machines, and inside the factory and corridor.

Shooting scale: For each location, the signs were photographed from different distances so that they appear at small, medium, and large scales.

Shooting light: Photos were taken under both good and poor lighting conditions.

Some of the images contain multiple targets; the others contain only a single sign.

Across all conditions, a total of 4,650 photos were taken for the original data. These were expanded to 27,900 photos via data enhancement (augmentation).
All images are located in folder ccss/data/JPEGImages. The file ccss/data/features/enhanced_data_to_original_data.csv provides a mapping between each enhanced image name and the corresponding original image.

1.2. Annotation and Labelling

The labelling tool is LabelImg, which uses the PASCAL-VOC labelling format. The annotations are stored in the folder ccss/data/Annotations. Faster R-CNN and SSD are two algorithms that use this format. When training YOLOv5, you can run trans_voc2yolo.py to convert the XML files in PASCAL-VOC format to txt files.

We provide further meta-information about the dataset in the form of a CSV file features.csv, which notes, for each image, which other features it has (lighting conditions, scale, multiplicity, etc.).

1.3. Dataset Features

As stated above, the images were shot under different conditions. We provide all the feature information in folder ccss/data/features. For each feature, there is a separate list of file names in that folder. The file ccss/data/features/features_on_original_data.csv is a CSV file that records all the features of each original image.

1.4. Dataset Division

The dataset has a fixed 7:3 split into a training set and a test set. You can find the corresponding image names in the files ccss/data/training_data_file_names.txt and ccss/data/test_data_file_names.txt.

2. Baseline Experiments

We provide baseline results for three models: Faster R-CNN, SSD, and YOLOv5. All code and results are given in folder ccss/experiment in archive ccss_experiment.

2.2. Environment and Configuration

- Single Intel Core i7-8700 CPU
- NVIDIA GTX1060 GPU
- 16 GB of RAM
- Python: 3.8.10
- PyTorch: 1.9.0
- pycocotools: pycocotools-win
- Visual Studio 2017
- Windows 10

2.3. Applied Models

The source code and results of the applied models are given in folder ccss/experiment, with sub-folders corresponding to the model names.

2.3.1. Faster R-CNN

backbone: resnet50+fpn.
- We downloaded the pre-training weights.
- We modified the type information of the JSON file to match our application.
- We ran train_res50_fpn.py.
- Finally, we obtained the weights trained on the training set.

backbone: mobilenetv2
- We applied the same training method as for resnet50+fpn, but the results were not as good as with resnet50+fpn, so this backbone was discarded.

The Faster R-CNN source code used in our experiment is given in folder ccss/experiment/sources/faster_rcnn. The weights of the fully-trained Faster R-CNN model are stored in file ccss/experiment/trained_models/faster_rcnn.pth. The performance measurements of Faster R-CNN are stored in folder ccss/experiment/performance_indicators/faster_rcnn.

2.3.2. SSD

backbone: resnet50
- We downloaded the pre-training weights.
- We applied the same training method as for Faster R-CNN.

The SSD source code used in our experiment is given in folder ccss/experiment/sources/ssd. The weights of the fully-trained SSD model are stored in file ccss/experiment/trained_models/ssd.pth. The performance measurements of SSD are stored in folder ccss/experiment/performance_indicators/ssd.

2.3.4. YOLOv5

backbone: CSP_DarkNet
- We modified the type information of the YML file to match our application.
- We ran trans_voc2yolo.py to convert the XML files in VOC format to txt files.
- The weights used are: yolov5s.

The YOLOv5 source code used in our experiment is given in folder ccss/experiment/sources/yolov5. The weights of the fully-trained YOLOv5 model are stored in file ccss/experiment/trained_models/yolov5.pt. The performance measurements of YOLOv5 are stored in folder ccss/experiment/performance_indicators/yolov5.

2.4. Evaluation

The computed evaluation metrics, as well as the code needed to compute them from our dataset, are provided in the folder ccss/experiment/performance_indicators. They are reported over the complete test set as well as separately for each image feature (over the test set).

3. Code Sources

- Faster R-CNN official code
- SSD official code
- YOLOv5

We are particularly thankful to the author of the GitHub repository WZMIAOMIAO/deep-learning-for-image-processing (with whom we are not affiliated). Their instructive videos and code were most helpful during our work. In particular, we based our own experimental code on their work (and obtained permission to include it in this archive).

4. Licensing

While our dataset and results are published under the Creative Commons Attribution 4.0 License, this does not hold for the included code sources. These sources are under the particular license of the repository from which they were obtained (see Section 3 above).
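
To give a feel for the VOC-to-YOLO conversion mentioned in Sections 1.2 and 2.3.4, the following is a minimal sketch of what such a conversion typically involves. It is not the dataset's trans_voc2yolo.py, whose contents are not reproduced here. The class names in CLASS_NAMES and the sample annotation in SAMPLE_XML are hypothetical placeholders, not taken from CCSS. The sketch assumes the usual conventions: PASCAL-VOC stores absolute corner coordinates (xmin, ymin, xmax, ymax), while YOLO label files store one line per object with a class index followed by the box centre and size, each normalised by the image width or height.

```python
import xml.etree.ElementTree as ET

# Hypothetical class list for illustration only; the real CCSS class
# names come from the dataset's own annotation files.
CLASS_NAMES = ["warning_sign", "prohibition_sign"]

# Hypothetical PASCAL-VOC annotation, in the layout written by LabelImg.
SAMPLE_XML = """<annotation>
  <filename>example.jpg</filename>
  <size><width>400</width><height>200</height><depth>3</depth></size>
  <object>
    <name>warning_sign</name>
    <bndbox>
      <xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>150</ymax>
    </bndbox>
  </object>
</annotation>"""


def voc_to_yolo_lines(xml_text, class_names):
    """Convert one PASCAL-VOC annotation (as a string) to YOLO label lines."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.findtext("width"))
    img_h = float(size.findtext("height"))

    lines = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        if name not in class_names:
            # Skip objects whose class is not in the list.
            continue
        class_id = class_names.index(name)

        box = obj.find("bndbox")
        xmin = float(box.findtext("xmin"))
        ymin = float(box.findtext("ymin"))
        xmax = float(box.findtext("xmax"))
        ymax = float(box.findtext("ymax"))

        # Box centre and size, normalised to the [0, 1] range.
        x_center = (xmin + xmax) / 2.0 / img_w
        y_center = (ymin + ymax) / 2.0 / img_h
        width = (xmax - xmin) / img_w
        height = (ymax - ymin) / img_h

        lines.append(
            f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
        )
    return lines


if __name__ == "__main__":
    for line in voc_to_yolo_lines(SAMPLE_XML, CLASS_NAMES):
        print(line)
```

In practice, one would presumably read each XML file from ccss/data/Annotations and write the resulting lines to a txt file with the same base name; the provided script should be preferred for reproducing the baseline results.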

Created:

2023-03-21
