
FraunhoferIOSB/Synset-Signset-Germany

Source: Hugging Face. Updated: 2026-03-16. Indexed: 2026-03-29.
Download link:
https://hf-mirror.com/datasets/FraunhoferIOSB/Synset-Signset-Germany
Resource description:
---
license: cc-by-4.0
license_link: LICENSE
task_categories:
- image-classification
- image-segmentation
language:
- en
tags:
- traffic sign recognition
- synthetic
- synset
- OCTAS
pretty_name: Synset Signset Germany
size_categories:
- 100K<n<1M
configs:
- config_name: Cycles
  data_files:
  - split: train
    path: "cycles/train.parquet"
  - split: validation
    path: "cycles/validation.parquet"
- config_name: OGRE
  data_files:
  - split: train
    path: "ogre/train.parquet"
  - split: validation
    path: "ogre/validation.parquet"
---

<img src="synset-signset-germany-title-image.png" width=100% />

# Synset Signset Germany

<!-- Provide a quick summary of the dataset. -->

The <em>Synset Signset Germany</em> dataset addresses the task of traffic sign recognition in Germany. It contains a total of 105,500 images of 211 different German traffic sign classes, including newly published (2020) and thus comparatively rare traffic signs. The subset of the first 43 classes in the dataset aims to represent a “synthetic twin” of the well-known [GTSRB](https://ieeexplore.ieee.org/abstract/document/6033395) dataset.

**Website**: [synset.de/datasets/synset-signset-ger/](https://synset.de/datasets/synset-signset-ger/) <br>
**Paper:** Sielemann, A., Loercher, L., Schumacher, M. L., Wolf, S., Roschani, M., Ziehn, J. and Beyerer, J. (2024). [Synset Signset Germany: A Synthetic Dataset for German Traffic Sign Recognition](https://ieeexplore.ieee.org/abstract/document/10920175). In 2024 IEEE 27th International Conference on Intelligent Transportation Systems (ITSC). [[arXiv](https://arxiv.org/abs/2512.05936)] <br>
**Authors:** [Anne Sielemann](https://www.linkedin.com/in/anne-sielemann-23011026a/), Lena Lörcher, Max-Lion Schumacher, [Stefan Wolf](https://www.linkedin.com/in/stefan-wolf-2552211a9/), Masoud Roschani, [Jens Ziehn](https://www.linkedin.com/in/jrziehn/), and Juergen Beyerer. [Fraunhofer IOSB](https://www.iosb.fraunhofer.de/) and [Fraunhofer IPA](https://www.ipa.fraunhofer.de/), Germany.
<br> **Funded by:** [Fraunhofer](https://www.fraunhofer.de/en.html) Internal Programs under Grant No. PREPARE 40-02702 within the ML4Safety project and the [German Federal Ministry for Economic Affairs and Climate Action](https://www.bundeswirtschaftsministerium.de/Navigation/EN/Home/home.html), within the program “New Vehicle and System Technologies” as part of the [AVEAS](https://aveas.org/) research project. <br>
**License:** CC-BY 4.0 <br>

## Description

The <em>Synset Signset Germany</em> dataset is a synthetic dataset designed for the task of traffic sign recognition in Germany. It contains a total of 105,500 images of 211 different German traffic sign classes, including newly published (2020) and thus comparatively rare traffic signs.

A [subset of 43 classes](https://huggingface.co/datasets/FraunhoferIOSB/Synset-Signset-Germany-GTSRB-Subset) in the dataset aims to represent a “synthetic twin” of the well-known <em>German Traffic Sign Recognition Benchmark</em> (GTSRB) and is therefore well suited for comparing real-world and synthetic data. Thanks to the extensive metadata, the dataset can also be used for applications in the context of explainable AI (XAI), robustness analyses, and systematic tests.

Our data generation approach is based on the Fraunhofer simulation platform [OCTAS®](https://octas.org/) and follows the general framework of physically-based rendering. It combines the advantages of data-driven and analytical modeling approaches: GAN-based texture generation is employed to produce data-driven dirt and wear artifacts, resulting in unique and realistic traffic sign surfaces. In parallel, analytical scene modulation ensures physically accurate illumination and appropriate geometric transformations, while also enabling fine-grained parameterization of the rendered scenes.

For each of the 211 traffic sign classes, the dataset contains 500 RGB images, adding up to 105,500 independent images in total.
All images were rendered by the rasterization-based engine [OGRE](https://ogre3d.org) as well as by the path tracing engine [Cycles](https://www.cycles-renderer.org/). In addition to a sample-wise mask and segmentation image, the dataset also contains extensive metadata, including the stochastically selected environment and imaging effect parameters for each image.

## Citation and Reference

<!-- If there is a paper or blog post introducing the dataset, the APA and Bibtex information for that should go in this section. -->

To cite this dataset in your scientific work, please use the following bibliography entry:

**BibTeX:**

    @inproceedings{synset_signset_ger_sielemann_2024,
      title={{Synset Signset Germany: A Synthetic Dataset for German Traffic Sign Recognition}},
      author={Sielemann, Anne and Loercher, Lena and Schumacher, Max-Lion and Wolf, Stefan and Roschani, Masoud and Ziehn, Jens and Beyerer, Juergen},
      booktitle={2024 IEEE 27th International Conference on Intelligent Transportation Systems (ITSC)},
      year={2024}
    }

**APA:**

Sielemann, A., Loercher, L., Schumacher, M., Wolf, S., Roschani, M., Ziehn, J., and Beyerer, J. (2024). Synset Signset Germany: A Synthetic Dataset for German Traffic Sign Recognition. In 2024 IEEE 27th International Conference on Intelligent Transportation Systems (ITSC).

In case of copying and redistributing, or publishing an adapted version of our dataset, please provide the name of our dataset, the creator names, a copyright notice, a link to this website, a license notice with a link to the license, and, if changes were made, a disclaimer notice, and a short description of the applied changes. For example, as follows:

This work is based on the Synset Signset Germany Dataset by Anne Sielemann, Lena Loercher, Max-Lion Schumacher, Stefan Wolf, Jens Ziehn, Masoud Roschani, and Juergen Beyerer, © 2024 Fraunhofer IOSB, All rights reserved.\
Link: https://synset.de/datasets/synset-signset-ger/\
Licence: CC BY 4.0, https://creativecommons.org/licenses/by/4.0/\
Disclaimer: The original authors are neither affiliated nor responsible for any applied changes.

## Uses

The dataset is designed for the task of traffic sign recognition in Germany.

### Direct Use

<!-- This section describes suitable use cases for the dataset. -->

The dataset is intended for the following use cases:

- Training ML models for the task of German traffic sign recognition.
- Analyzing the difference between the synthetic dataset and real-world traffic sign recognition datasets, especially the closely related [GTSRB](https://ieeexplore.ieee.org/abstract/document/6033395) dataset.
- Testing ML models for the task of traffic sign recognition, in particular by using the detailed metadata of each image.

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the dataset will not work well for. -->

The dataset should not be used for critical applications, particularly high-risk applications as named by the European AI Act under Annex III (which includes "AI systems intended to be used for the ‘real-time’ and ‘post’ remote biometric identification of natural persons" and "AI systems intended to be used as safety components in the management and operation of road traffic"), without exhaustive research into the fitness of the dataset, to evaluate whether it is "relevant, sufficiently representative, and to the best extent possible free of errors and complete in view of the intended purpose of the system." No such claim is made with the publication of this dataset.

## Dataset Structure

<!-- This section provides a description of the dataset fields, and additional information about the dataset structure such as criteria used to create the splits, relationships between data points, etc.
-->

The dataset is available in two rendering variants:

- Rendering performed by the [OCTAS®](https://octas.org/) API to [OGRE](https://ogre3d.org) engine interface.
- Rendering performed by the [OCTAS®](https://octas.org/) API to [Cycles](https://www.cycles-renderer.org/) engine interface.

Both variants include 211 traffic sign classes with 500 images each, leading to a total of 105,500 images. Each of the 105,500 renderings contains the raw image (i.e., the simulated camera image), a semantic segmentation image, a mask image, and metadata about the traffic sign status (orientation, upper signs, lower signs, etc.), the environment (daytime, contrast, location, etc.), and the imaging effects (noise level, motion blur strength, aec error, etc.). The dataset provides an exemplary training and validation split.

## Dataset Creation

### Curation Rationale

<!-- Motivation for the creation of this dataset. -->

Traffic sign recognition is, on the one hand, a well-understood and established task with a wide range of publicly available datasets and applicable models. On the other hand, it remains the subject of active research, in particular to address challenges such as corner cases and weather conditions, and it has practical relevance, for example, for driver assistance systems, automated driving, and mapping. Since new traffic signs are constantly being released (for example, in 2020 in Germany) and the coverage of less common classes in publicly available datasets is still limited, the demand for both training and testing data persists. The dataset was designed to be comparable to the [GTSRB](https://ieeexplore.ieee.org/abstract/document/6033395) dataset, one of the best-known publicly available traffic sign recognition benchmark datasets. Furthermore, it aims to provide detailed metadata to enable dataset uses such as XAI analyses and robustness checks.
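As a rough illustration of the two rendering variants and the exemplary split described under Dataset Structure, the sketch below collects the configuration names, split names, and parquet paths from the card's YAML header and recomputes the stated size of 211 classes with 500 images each. The helper names (`expected_images_per_variant`, `parquet_path`, `load_split`) are hypothetical and introduced here only for illustration. `load_split` assumes that the Hugging Face `datasets` library is installed and that network access is available. The card does not list the column names, so they would need to be inspected after loading, for example via the `features` attribute of the returned dataset.

```python
"""Sketch: addressing the two rendering variants of Synset Signset Germany."""

REPO_ID = "FraunhoferIOSB/Synset-Signset-Germany"

# Config names, split names, and parquet paths as declared in the card's YAML header.
CONFIG_FILES = {
    "Cycles": {
        "train": "cycles/train.parquet",
        "validation": "cycles/validation.parquet",
    },
    "OGRE": {
        "train": "ogre/train.parquet",
        "validation": "ogre/validation.parquet",
    },
}

NUM_CLASSES = 211       # traffic sign classes, per the card
IMAGES_PER_CLASS = 500  # renderings per class and variant, per the card


def expected_images_per_variant() -> int:
    """Return the number of images each rendering variant should contain."""
    return NUM_CLASSES * IMAGES_PER_CLASS


def parquet_path(config: str, split: str) -> str:
    """Look up the parquet file of a config/split pair; raise ValueError if unknown."""
    if config not in CONFIG_FILES:
        raise ValueError(f"unknown config {config!r}; expected one of {sorted(CONFIG_FILES)}")
    splits = CONFIG_FILES[config]
    if split not in splits:
        raise ValueError(f"unknown split {split!r}; expected one of {sorted(splits)}")
    return splits[split]


def load_split(config: str, split: str):
    """Load one split of one variant (hypothetical helper; needs `datasets` and network)."""
    parquet_path(config, split)  # validate the names before touching the network
    from datasets import load_dataset  # imported lazily so the helpers above work offline
    return load_dataset(REPO_ID, config, split=split)


if __name__ == "__main__":
    print(expected_images_per_variant())
    print(parquet_path("OGRE", "validation"))
```

Under these assumptions, `load_split("Cycles", "train")` would return the training split of the path-traced variant, and `load_split("OGRE", "train")` the corresponding split of the rasterized variant.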
### Source Data

<!-- This section describes the source data (e.g. news text and headlines, social media posts, translated sentences, ...). -->

- The dataset was generated in the [OCTAS®](https://octas.org/) simulation framework, by using rasterization through the [OGRE](https://ogre3d.org) engine, as well as path tracing through the [Cycles](https://www.cycles-renderer.org/) engine developed by the [Blender Project](https://docs.blender.org/manual/en/latest/render/cycles/index.html).
- The traffic sign template images, which are used as input to the GAN-based texture synthesis, stem from the [Wikipedia overview of German traffic signs](https://de.wikipedia.org/wiki/Bildtafel_der_Verkehrszeichen_in_der_Bundesrepublik_Deutschland_seit_2017).
- Image-based lighting (IBL) uses 327 environment maps from [PolyHaven](https://polyhaven.com/).
- The 3D geometry of the tree that serves as an occlusion object originates from [PolyHaven](https://polyhaven.com/).

#### Who are the source data producers?

<!-- This section describes the people or systems who originally created the data. It should also include self-reported demographic or identity information for the source data creators if this information is available. -->

- [PolyHaven](https://polyhaven.com/), as the provider of the environment maps for image-based lighting (IBL) and the 3D tree object, is an online library for open (CC0) 3D assets provided by different authors.
- [Wikipedia](https://de.wikipedia.org/), one of the largest free multilingual open-content encyclopedias, includes the complete list of existing German traffic signs and their template images.

### Annotations

<!-- If the dataset contains annotations which are not part of the initial data collection, use this section to describe them.
-->

#### Annotation process

<!-- This section describes the annotation process such as annotation tools used in the process, the amount of data annotated, annotation guidelines provided to the annotators, interannotator statistics, annotation validation, etc. -->

Most of the annotations, including masks, segmentation images, camera parameters and artifacts, and environmental conditions, are based on ground truth data created as part of the scene generation / rendering process. Semantic segmentation images were rendered using the [OGRE](https://www.ogre3d.org/) rendering engine plugin for [OCTAS®](https://octas.org/), which provides rasterization / shading-based image generation. The only manual annotation performed in the creation of this dataset is the labeling of permissible upper and lower signs, taking the German traffic code / regulation [StVO](https://www.stvo2go.de/verkehrszeichen-wissensnetz/) (Straßenverkehrs-Ordnung) and real-world examples into account.

#### Who are the annotators?

<!-- This section describes the people or systems who created the annotations. -->

The annotation of the permissible upper and lower signs was performed by the authors.

#### Personal and Sensitive Information

<!-- State whether the dataset contains data that might be considered personal, sensitive, or private (e.g., data that reveals addresses, uniquely identifiable names or aliases, racial or ethnic origins, sexual orientations, religious beliefs, political opinions, financial or health data, etc.). If efforts were made to anonymize the data, describe the anonymization process. -->

The dataset contains no data that might be considered personal, sensitive, or private.

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

- **Traffic Signs:** The wear and tear generation is limited to artifacts such as color fading, scratches, screw holes, and sticker residues.
  Complex stickers, graffiti, or dirt are not included. Retroreflector patterns are excluded, and retroreflection is not simulated. The traffic signs are solely mounted on metallic traffic sign poles.
- **Environment:** Environmental variation includes no adverse weather conditions (snow, raindrops, fog, ...).
- **Occlusions:** All included occlusions or shadows stem from a single 3D tree geometry.
- **Camera:** Only one set of intrinsic camera parameters is used, and only a single camera lens type (based on a Tamron M112FM35 35 mm lens) is simulated. It can be assumed that the set of simulated imaging artifacts is not complete.

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

<!-- Users should be made aware of the risks, biases and limitations of the dataset. More information needed for further recommendations. -->

It is recommended to use the dataset primarily for scientific research. Application to practical real-world use cases should include human oversight and exhaustive evaluation of the fitness for the respective purpose, including the impact of domain shifts.

<!-- ## Glossary [optional] If relevant, include terms and calculations in this section that can help readers understand the dataset or dataset card. [More Information Needed] ## More Information [optional] [More Information Needed]-->

## Dataset Card Contact

Anne Sielemann\
Fraunhofer IOSB\
Group »Automotive and Simulation«\
Fraunhoferstr. | 76131 Karlsruhe | Germany\
anne.sielemann@iosb.fraunhofer.de\
[www.iosb.fraunhofer.de](https://www.iosb.fraunhofer.de)

Jens Ziehn\
Fraunhofer IOSB\
Group leader »Automotive and Simulation«\
Fraunhoferstr. | 76131 Karlsruhe | Germany\
Phone +49 721 6091 – 633\
jens.ziehn@iosb.fraunhofer.de\
[www.iosb.fraunhofer.de](https://www.iosb.fraunhofer.de)
Provided by:
FraunhoferIOSB