Dataset B of Paper "Hidden in Plain Signs: Realistic Sticker Attacks on Production Traffic Sign Recognition Systems"
Source: DataCite Commons. Updated: 2026-05-04. Indexed: 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.19677352
⚠️ Academic Use Only: Non-Commercial Research Dataset
This dataset is a derivative work assembled exclusively for academic research purposes. It inherits Non-Commercial (NC) restrictions from its source datasets. See the License section for full details.
Overview
This dataset was compiled for academic research on traffic sign detection and recognition. It aggregates and preprocesses images from five publicly available benchmark datasets, applying cropping and resizing transformations to meet the input requirements of the deep learning models evaluated in our work.
Key facts:
Purpose: Academic research only (non-commercial)
Task: Traffic sign detection / recognition
Images: 30,226
Classes: 17
Splits: Train / Validation / Test
Image format: JPEG
Dataset Composition
This dataset is a derivative work combining images from the following source datasets. Each subset retains the license of its original source.
| # | Source Dataset | Images Used | Classes Used |
|---|----------------|-------------|--------------|
| 1 | Mapillary Traffic Sign Dataset (MTSD) | 13,662 | Limits 10-120 km/h, Stop, No Vehicles, No Stopping, No Parking, No Entry |
| 2 | DFG Traffic Sign Dataset | 2,360 | Limits 10, 30-70 km/h, Stop, No Vehicles, No Stopping, No Parking, No Entry |
| 3 | TT100k | 6,605 | Limits 10-120 km/h, No Vehicles, No Stopping, No Parking, No Entry |
| 4 | GTSDB (German Traffic Sign Detection Benchmark) | 357 | Limits 20, 30, 50-80, 100, 120 km/h, Stop, No Entry |
| 5 | ItalianSigns | 361 | all (Limits 20-90 km/h) |
Due to the preprocessing phase (zooming/cropping), some images were split into multiple patches. The total number of images in this dataset is therefore greater than the sum of the "Images Used" column above.
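As a quick sanity check of the figures above, the short sketch below sums the "Images Used" column and compares it with the total of 30,226 images. It assumes that every selected source image yields at least one image in the final dataset, so the difference can be read as the net number of extra patches created by splitting; the source does not state this explicitly.

```python
# "Images Used" per source, copied from the composition table.
images_used = {
    "MTSD": 13_662,
    "DFG": 2_360,
    "TT100k": 6_605,
    "GTSDB": 357,
    "ItalianSigns": 361,
}
total_images = 30_226  # "Images" figure from the key facts

source_total = sum(images_used.values())
# Assumption: no selected image was dropped, so the surplus comes from patch splitting.
extra_patches = total_images - source_total

print(source_total)   # 23345
print(extra_patches)  # 6881
```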
Dataset Structure
dataset/
├── LICENSE.txt
├── scripts/
│   ├── adapt_sources/      # Pipelines to extract data from original sources, one script per dataset
│   │   ├── Mapillary.py
│   │   ├── DFG.py
│   │   ├── TT100k.py
│   │   ├── GTSDB.py
│   │   └── ItalianSigns.py
│   └── zoom_and_crop.py    # Script to be applied after download of the sources, to zoom images with very small objects
├── images/
│   ├── train/
│   ├── test/
│   └── val/
└── labels/                 # One .txt file per image, with the annotations in YOLO format.
    ├── train/
    ├── test/
    └── val/
Label format: YOLO .txt
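For readers unfamiliar with YOLO labels, the following is a minimal sketch of how one label file could be read. It assumes the common five-field layout (`class_id x_center y_center width height`, with coordinates normalized to the image size); the exact class-id mapping used in this dataset is not given here, and `Box` and `parse_yolo_labels` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_yolo_labels(text: str, img_w: int, img_h: int) -> list[Box]:
    """Convert YOLO label lines into pixel-coordinate boxes.

    Assumes each line is: class_id x_center y_center width height,
    with the four coordinates normalized to [0, 1].
    """
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        class_id = int(parts[0])
        xc, yc, w, h = (float(v) for v in parts[1:5])
        boxes.append(Box(
            class_id,
            (xc - w / 2) * img_w,
            (yc - h / 2) * img_h,
            (xc + w / 2) * img_w,
            (yc + h / 2) * img_h,
        ))
    return boxes


sample = "3 0.5 0.5 0.25 0.5\n"
print(parse_yolo_labels(sample, img_w=640, img_h=480))
```

With the sample line above, a 640x480 image gives a box spanning x from 240 to 400 and y from 120 to 360.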
Preprocessing
After the official sources were downloaded, the images were processed with the scripts in the /scripts folder of this repository, both to extract the images relevant to our purposes and to meet model input requirements.
In particular:
1. Image extraction: The scripts in /scripts/adapt_sources were run, one per source dataset. They select only the relevant images and convert their annotations to our labeling scheme and the YOLO format. The images are renamed using the name of the source dataset with an incremental numeric suffix (e.g., `DFG (0).jpg`, `DFG (1).jpg`).
2. Cropping and zooming: The script /scripts/zoom_and_crop.py was run on the images produced by step 1. It zooms in on images containing very small traffic signs to better isolate the sign region. When an image is split into multiple patches during this process, an incremental numeric suffix is appended to the resulting sub-images (e.g., `DFG (0)_0.jpg`, `DFG (0)_1.jpg`).
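The naming scheme described in the two steps above can be decoded with a small regular expression. This is only a sketch: it assumes every file follows the `<source> (<index>)[_<patch>].jpg` pattern seen in the `DFG` examples, and `parse_image_name` is a hypothetical helper, not part of the provided scripts.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the examples `DFG (0).jpg` and `DFG (0)_1.jpg`.
NAME_RE = re.compile(
    r"^(?P<source>.+) \((?P<index>\d+)\)(?:_(?P<patch>\d+))?\.jpg$"
)


class ImageName(NamedTuple):
    source: str           # source dataset name, e.g. "DFG"
    index: int            # incremental index assigned during extraction
    patch: Optional[int]  # patch number, or None if the image was not split


def parse_image_name(filename: str) -> ImageName:
    """Split a dataset file name into source, index and optional patch."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    patch = match.group("patch")
    return ImageName(
        match.group("source"),
        int(match.group("index")),
        None if patch is None else int(patch),
    )


print(parse_image_name("DFG (0).jpg"))
print(parse_image_name("DFG (0)_1.jpg"))
```

Grouping files by `(source, index)` recovers which patches came from the same original image.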
Finally, the dataset is split into training (70%), testing (15%) and validation (15%) images.
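The description does not say how the split was drawn (random or stratified, with or without a fixed seed). As an illustration only, a plain seeded random split with the stated proportions could look like the sketch below; `split_dataset` and its parameters are hypothetical and do not reproduce the authors' procedure.

```python
import random


def split_dataset(names, seed=0, train=0.70, test=0.15):
    """Shuffle names deterministically and cut them into three parts.

    The validation part receives whatever remains after train and test.
    """
    shuffled = sorted(names)               # fixed starting order
    random.Random(seed).shuffle(shuffled)  # seeded, hence reproducible
    n_train = round(len(shuffled) * train)
    n_test = round(len(shuffled) * test)
    return {
        "train": shuffled[:n_train],
        "test": shuffled[n_train:n_train + n_test],
        "val": shuffled[n_train + n_test:],
    }


names = [f"DFG ({i}).jpg" for i in range(100)]
parts = split_dataset(names)
print({k: len(v) for k, v in parts.items()})  # {'train': 70, 'test': 15, 'val': 15}
```

Applied to 30,226 file names, these fractions give 21,158 / 4,534 / 4,534 images; the actual folder sizes in the archive may differ slightly depending on how rounding was handled.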
License
This dataset is a derivative work and is released under CC BY-NC-SA 4.0 (Creative Commons Attribution – NonCommercial – ShareAlike 4.0 International), which is the most restrictive license among those of the contributing source datasets that require ShareAlike terms.
In summary, you are free to:
Share — copy and redistribute this dataset in any medium or format
Adapt — preprocess, crop, or otherwise transform the material
Under the following terms:
Attribution (BY) — You must give appropriate credit to all original source datasets (see Citations).
NonCommercial (NC) — You may not use this dataset for commercial purposes.
ShareAlike (SA) — If you build upon this dataset, you must distribute your derivative work under the same CC BY-NC-SA 4.0 license.
Per-source license summary
| Source | License | License Link |
|--------|---------|--------------|
| Mapillary TSD | CC BY-NC-SA 4.0 | https://creativecommons.org/licenses/by-nc-sa/4.0/ |
| DFG Traffic Sign Dataset | CC BY-NC-SA 4.0 | https://creativecommons.org/licenses/by-nc-sa/4.0/ |
| TT100k | CC BY-NC 4.0 | https://creativecommons.org/licenses/by-nc/4.0/ |
| GTSDB | No explicit license stated; research use only per original authors | https://benchmark.ini.rub.de/gtsdb_dataset.html |
| ItalianSigns | GNU LGPLv3 | https://www.gnu.org/licenses/lgpl-3.0.html |
Disclaimer: The authors of this compiled dataset are not lawyers and this summary does not constitute legal advice. Users are responsible for verifying compliance with each source dataset's license before use.
Note on GTSDB licensing: The GTSDB was published by the Institut für Neuroinformatik (INI) at Ruhr-Universität Bochum. Please refer to the original dataset page for the authoritative license terms before reusing this subset.
Citations
If you use this dataset in your research, please cite all original source datasets listed below.
Source dataset citations
1. Mapillary Traffic Sign Dataset (MTSD)
@inproceedings{ertler2020mapillary,
title = {The mapillary traffic sign dataset for detection and classification on a global scale},
author = {Ertler, Christian and Mislej, Jerneja and Ollmann, Tobias and
Porzi, Lorenzo and Neuhold, Gerhard and Kuang, Yubin},
booktitle = {European conference on computer vision},
pages = {68--84},
year = {2020},
organization = {Springer}
}
2. DFG Traffic Sign Dataset
@article{Tabernik2019ITS,
author = {Tabernik, Domen and Sko{\v{c}}aj, Danijel},
journal = {IEEE Transactions on Intelligent Transportation Systems},
title = {{Deep Learning for Large-Scale Traffic-Sign Detection and Recognition}},
year = {2019},
doi = {10.1109/TITS.2019.2913588},
ISSN = {1524-9050}
}
3. TT100k
@InProceedings{Zhe_2016_CVPR,
title = {Traffic-Sign Detection and Classification in the Wild},
author = {Zhu, Zhe and Liang, Dun and Zhang, Songhai and Huang, Xiaolei
and Li, Baoli and Hu, Shimin},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2016}
}
4. GTSDB (German Traffic Sign Detection Benchmark)
@inproceedings{houben2013gtsdb,
title = {Detection of Traffic Signs in Real-World Images: The {German Traffic
Sign Detection Benchmark}},
author = {Houben, Sebastian and Stallkamp, Johannes and Salmen, Jan and
Schlipsing, Marc and Igel, Christian},
booktitle = {International Joint Conference on Neural Networks (IJCNN)},
year = {2013},
doi = {10.1109/IJCNN.2013.6706807}
}
5. ItalianSigns
@misc{ItalianSigns,
author = {Daniel Rossi and Riccardo Salami},
title = {ItalianSigns},
year = {2022},
howpublished = {\url{https://www.kaggle.com/datasets/officialprojecto/italiansigns}}
}
Acknowledgements
We thank the authors and institutions that made their datasets publicly available for research purposes:
The Mapillary team for the MTSD dataset.
Domen Tabernik and Danijel Skočaj for the DFG Traffic Sign Dataset.
Zhe Zhu et al. for the TT100k dataset.
Sebastian Houben et al. for the GTSDB.
Daniel Rossi and Riccardo Salami for the ItalianSigns traffic sign dataset.
Contact
This dataset was submitted anonymously for peer review. After the review process, author information and the associated paper reference will be added here.
For questions regarding licensing and reuse, please open an issue on this repository or contact the corresponding author after de-anonymization.
Last updated: 2026
Dataset version: 1.0
Provider: Zenodo
Created: 2026-05-04



