
Dataset B of Paper "Hidden in Plain Signs: Realistic Sticker Attacks on Production Traffic Sign Recognition Systems"

Source: DataCite Commons. Updated 2026-05-04; indexed 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.19677353
Description:
⚠️ Academic Use Only: Non-Commercial Research Dataset

This dataset is a derivative work assembled exclusively for academic research purposes. It inherits Non-Commercial (NC) restrictions from its source datasets. See the License section for full details.

## Overview

This dataset was compiled for academic research on traffic sign detection and recognition. It aggregates and preprocesses images from five publicly available benchmark datasets, applying cropping and resizing transformations to meet the input requirements of the deep learning models evaluated in our work.

Key facts:

- Purpose: Academic research only (non-commercial)
- Task: Traffic sign detection / recognition
- Images: 30,226
- Classes: 17
- Splits: Train / Validation / Test
- Image format: JPEG

## Dataset Composition

This dataset is a derivative work combining images from the following source datasets. Each subset retains the license of its original source.

| # | Source Dataset | Images Used | Classes Used |
|---|----------------|-------------|--------------|
| 1 | Mapillary Traffic Sign Dataset (MTSD) | 13,662 | Speed limits 10-120 km/h, Stop, No Vehicles, No Stopping, No Parking, No Entry |
| 2 | DFG Traffic Sign Dataset | 2,360 | Speed limits 10, 30-70 km/h, Stop, No Vehicles, No Stopping, No Parking, No Entry |
| 3 | TT100k | 6,605 | Speed limits 10-120 km/h, No Vehicles, No Stopping, No Parking, No Entry |
| 4 | GTSDB (German Traffic Sign Detection Benchmark) | 357 | Speed limits 20, 30, 50-80, 100, 120 km/h, Stop, No Entry |
| 5 | ItalianSigns | 361 | All (speed limits 20-90 km/h) |

Due to the preprocessing phase (zooming/cropping), some images were split into multiple patches. The total number of images in this dataset is therefore greater than the sum of the "Images Used" column above.
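As a quick check of the figures above, the following sketch sums the "Images Used" column and compares it with the 30,226 images listed under Key facts. It assumes the per-source counts refer to source images before zooming and cropping; the variable names are illustrative and do not come from the dataset's scripts.

```python
# Per-source "Images Used" counts, copied from the composition table.
images_used = {
    "MTSD": 13_662,
    "DFG": 2_360,
    "TT100k": 6_605,
    "GTSDB": 357,
    "ItalianSigns": 361,
}

# "Images" entry from the Key facts list.
total_images = 30_226

# Sum of the source images selected from the five datasets.
source_total = sum(images_used.values())

# Net surplus of final images over source images (assumed to come from
# images being split into multiple patches during zooming/cropping).
extra_from_patching = total_images - source_total

print(f"source images selected: {source_total}")         # 23345
print(f"net additional images:  {extra_from_patching}")  # 6881
```

Under that assumption, patch splitting accounts for a net gain of 6,881 images over the 23,345 source images.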
## Dataset Structure

    dataset/
    ├── LICENSE.txt
    ├── scripts/
    │   ├── adapt_sources/       # Pipelines to extract data from the original sources, one script per dataset
    │   │   ├── Mapillary.py
    │   │   ├── DFG.py
    │   │   ├── TT100k.py
    │   │   ├── GTSDB.py
    │   │   └── ItalianSigns.py
    │   └── zoom_and_crop.py     # Run after downloading the sources, to zoom in on images with very small objects
    ├── images/
    │   ├── train/
    │   ├── test/
    │   └── val/
    └── labels/                  # One .txt file per image, with the annotations in YOLO format
        ├── train/
        ├── test/
        └── val/

Label format: YOLO .txt

## Preprocessing

After the official sources were downloaded, the images were processed with the scripts in the /scripts folder of this repository, to extract the images relevant to our purposes and to meet model input requirements. In particular:

1. Image extraction: The scripts in /scripts/adapt_sources were run, one for each dataset. They select only the relevant images and convert their annotations to our labeling and the YOLO format. The images are renamed using the name of the source dataset with an incremental numeric suffix (e.g., `DFG (0).jpg`, `DFG (1).jpg`).
2. Cropping and zooming: The script /scripts/zoom_and_crop.py was run on the images obtained after step 1. It zooms in on the images with very small traffic signs, to better isolate the sign region. When an image is split into multiple patches during this process, an incremental numeric suffix is appended to the resulting sub-images (e.g., `DFG (0)_0.jpg`, `DFG (0)_1.jpg`).

Finally, the dataset is split into training (70%), testing (15%) and validation (15%) images.

## License

This dataset is a derivative work and is released under CC BY-NC-SA 4.0 (Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International), which is the most restrictive license among those of the contributing source datasets that require ShareAlike terms.
In summary, you are free to:

- Share: copy and redistribute this dataset in any medium or format
- Adapt: preprocess, crop, or otherwise transform the material

Under the following terms:

- Attribution (BY): You must give appropriate credit to all original source datasets (see Citations).
- NonCommercial (NC): You may not use this dataset for commercial purposes.
- ShareAlike (SA): If you build upon this dataset, you must distribute your derivative work under the same CC BY-NC-SA 4.0 license.

### Per-source license summary

| Source | License | License Link |
|--------|---------|--------------|
| Mapillary TSD | CC BY-NC-SA 4.0 | https://creativecommons.org/licenses/by-nc-sa/4.0/ |
| DFG Traffic Sign Dataset | CC BY-NC-SA 4.0 | https://creativecommons.org/licenses/by-nc-sa/4.0/ |
| TT100k | CC BY-NC 4.0 | https://creativecommons.org/licenses/by-nc/4.0/ |
| GTSDB | No explicit license stated; research use only per original authors | https://benchmark.ini.rub.de/gtsdb_dataset.html |
| ItalianSigns | GNU LGPLv3 | https://www.gnu.org/licenses/lgpl-3.0.html |

Disclaimer: The authors of this compiled dataset are not lawyers and this summary does not constitute legal advice. Users are responsible for verifying compliance with each source dataset's license before use.

Note on GTSDB licensing: The GTSDB was published by the Institut für Neuroinformatik (INI) at Ruhr-Universität Bochum. Please refer to the original dataset page for the authoritative license terms before reusing this subset.

## Citations

If you use this dataset in your research, please cite all original source datasets listed below.

### Source dataset citations

1. Mapillary Traffic Sign Dataset (MTSD)

       @inproceedings{ertler2020mapillary,
         title={The mapillary traffic sign dataset for detection and classification on a global scale},
         author={Ertler, Christian and Mislej, Jerneja and Ollmann, Tobias and Porzi, Lorenzo and Neuhold, Gerhard and Kuang, Yubin},
         booktitle={European conference on computer vision},
         pages={68--84},
         year={2020},
         organization={Springer}
       }

2. DFG Traffic Sign Dataset

       @article{Tabernik2019ITS,
         author={Tabernik, Domen and Sko{\v{c}}aj, Danijel},
         journal={IEEE Transactions on Intelligent Transportation Systems},
         title={{Deep Learning for Large-Scale Traffic-Sign Detection and Recognition}},
         year={2019},
         doi={10.1109/TITS.2019.2913588},
         ISSN={1524-9050}
       }

3. TT100k

       @InProceedings{Zhe_2016_CVPR,
         title={Traffic-Sign Detection and Classification in the Wild},
         author={Zhu, Zhe and Liang, Dun and Zhang, Songhai and Huang, Xiaolei and Li, Baoli and Hu, Shimin},
         booktitle={The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
         year={2016}
       }

4. GTSDB (German Traffic Sign Detection Benchmark)

       @inproceedings{houben2013gtsdb,
         title={Detection of Traffic Signs in Real-World Images: The {German Traffic Sign Detection Benchmark}},
         author={Houben, Sebastian and Stallkamp, Johannes and Salmen, Jan and Schlipsing, Marc and Igel, Christian},
         booktitle={International Joint Conference on Neural Networks (IJCNN)},
         year={2013},
         doi={10.1109/IJCNN.2013.6706807}
       }

5. ItalianSigns

       @misc{ItalianSigns,
         author={Daniel Rossi and Riccardo Salami},
         title={ItalianSigns},
         year={2022},
         howpublished={\url{https://www.kaggle.com/datasets/officialprojecto/italiansigns}}
       }

## Acknowledgements

We thank the authors and institutions that made their datasets publicly available for research purposes:

- The Mapillary team for the MTSD dataset.
- Domen Tabernik and Danijel Skočaj for the DFG Traffic Sign Dataset.
- Zhe Zhu et al. for the TT100k dataset.
- Sebastian Houben et al. for the GTSDB.
- Daniel Rossi and Riccardo Salami for the ItalianSigns traffic sign dataset.

## Contact

This dataset was submitted anonymously for peer review. After the review process, author information and the associated paper reference will be added here.

For questions regarding licensing and reuse, please open an issue on this repository or contact the corresponding author after de-anonymization.
Last updated: 2026

Dataset version: 1.0
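The file naming scheme and the YOLO label files described under Dataset Structure and Preprocessing can be read with a few lines of Python. The sketch below is not taken from the dataset's scripts. It assumes each label file shares its image's base name and that each label line follows the common YOLO convention of `class_id x_center y_center width height`, with coordinates normalized to the image size. The names `parse_name`, `parse_yolo_labels`, `to_pixels` and `Box` are hypothetical, and the mapping from class IDs to the 17 classes is not documented here.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Documented naming scheme: "<Source> (<index>).jpg", with an optional
# "_<patch>" suffix when zoom_and_crop split the image into patches.
# Accepting ".txt" as well is an assumption about the label file names.
NAME_RE = re.compile(
    r"(?P<source>.+) \((?P<index>\d+)\)(?:_(?P<patch>\d+))?\.(?:jpg|txt)"
)


@dataclass
class Box:
    """One YOLO annotation; coordinates are fractions of the image size."""
    class_id: int
    x_center: float
    y_center: float
    width: float
    height: float


def parse_name(filename: str) -> Tuple[str, int, Optional[int]]:
    """Split a file name into (source dataset, image index, patch index)."""
    match = NAME_RE.fullmatch(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    patch = match.group("patch")
    return (
        match.group("source"),
        int(match.group("index")),
        None if patch is None else int(patch),
    )


def parse_yolo_labels(text: str) -> List[Box]:
    """Parse the contents of one label file, one box per non-empty line."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 5:
            raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
        boxes.append(Box(int(parts[0]), *map(float, parts[1:])))
    return boxes


def to_pixels(box: Box, img_w: int, img_h: int) -> Tuple[float, float, float, float]:
    """Convert a normalized box to (left, top, right, bottom) in pixels."""
    left = (box.x_center - box.width / 2) * img_w
    top = (box.y_center - box.height / 2) * img_h
    right = (box.x_center + box.width / 2) * img_w
    bottom = (box.y_center + box.height / 2) * img_h
    return left, top, right, bottom


print(parse_name("DFG (0).jpg"))    # ('DFG', 0, None)
print(parse_name("DFG (0)_1.jpg"))  # ('DFG', 0, 1)

boxes = parse_yolo_labels("3 0.5 0.5 0.25 0.5\n")
print(to_pixels(boxes[0], 640, 480))  # (240.0, 120.0, 400.0, 360.0)
```

Grouping files by the `(source, index)` pair returned by `parse_name` would bring together the patches cut from one original image. This may be useful when checking that patches of the same image do not end up in different splits.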

Provider: Zenodo
Created: 2026-05-04