tanaos/synthetic-guardrail-dataset-german

Name: tanaos/synthetic-guardrail-dataset-german
Creator: tanaos
Published: 2026-02-10 07:47:23
License: MIT

Source: Hugging Face. Updated: 2026-02-10. Indexed: 2026-04-05.

Download link:

https://hf-mirror.com/datasets/tanaos/synthetic-guardrail-dataset-german

Resource overview:

---
language:
- de
license: mit
tags:
- moderation
- guardrail
- text-classification
- toxicity-detection
- llm-safety
- content-safety
- alignment
- ethical-ai
- synthetic-data
- tanaos
pretty_name: tanaos-guardrail-german Training Dataset
task_categories:
- text-classification
task_ids:
- hate-speech-detection
- sentiment-classification
size_categories:
- 10K<n<20K
---

<p align="center">
  <img src="https://raw.githubusercontent.com/tanaos/.github/master/assets/logo.png" width="250px" alt="Tanaos – Train task specific LLMs without training data, for offline NLP and Text Classification">
</p>

# Tanaos Guardrail German Training Dataset

This dataset was created synthetically by Tanaos with the [Artifex](https://github.com/tanaos/artifex) Python library.

The dataset is designed to **train and evaluate guardrail systems** (models that detect, classify, or filter unsafe, harmful, or potentially dangerous content) in German. It can be used to **train moderation models** or to integrate **LLM safety filters** into applications such as chatbots, content generation, and user-facing AI systems.

Our German guardrail model, [tanaos-guardrail-german](https://huggingface.co/tanaos/tanaos-guardrail-german), was trained on this dataset.

## Dataset Summary

Each text sample in the dataset is paired with an array of 14 binary labels. A label of `1` indicates that the text falls into the corresponding unsafe category, while a label of `0` indicates that it does not.

The categories are the following:

| Label Index | Category                  | Description                                                                  |
|-------------|---------------------------|------------------------------------------------------------------------------|
| 0           | violence                  | Content describing or encouraging violent acts                               |
| 1           | non_violent_unethical     | Content containing hateful or discriminatory language                        |
| 2           | financial_crime           | Content related to financial fraud or scams                                  |
| 3           | discrimination            | Content promoting discrimination against individuals or groups               |
| 4           | drug_weapons              | Content related to illegal drugs or weapons                                  |
| 5           | self_harm                 | Content encouraging self-harm or suicide                                     |
| 7           | privacy                   | Content that invades personal privacy or shares private information          |
| 8           | sexual_content            | Content that is sexually explicit or inappropriate                           |
| 9           | child_abuse               | Content involving the exploitation or abuse of children                      |
| 10          | terrorism_organized_crime | Content related to terrorism or organized crime                              |
| 11          | hacking                   | Content related to unauthorized computer access or cyberattacks              |
| 12          | animal_abuse              | Content involving the abuse or mistreatment of animals                       |
| 13          | jailbreak_prompt_inj      | Content attempting to bypass or manipulate system instructions or safeguards |

For instance, the label `[0 0 0 0 0 0 0 0 0 0 0 0 0 0]` means that the corresponding text is safe, while the label `[0 1 0 0 0 0 0 0 0 0 0 0 0 1]` means that the text is unsafe because it contains both `non_violent_unethical` and `jailbreak_prompt_inj` content.

## How to Use

```python
from datasets import load_dataset

# Download the dataset from the Hugging Face Hub
dataset = load_dataset("tanaos/synthetic-guardrail-dataset-german")

# Inspect the first example of the training split
print(dataset["train"][0])
```

## Intended Use

This dataset is meant for **training, fine-tuning, and evaluating** models that act as **guardrails** for AI systems that handle German-language content.

Common use cases:

- Detecting and filtering toxic or policy-violating user input
- Reinforcing LLMs with content safety constraints
- Improving safety layers in production AI assistants or chatbots
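
The label arrays described in the Dataset Summary can be mapped back to category names with a small helper. What follows is a minimal sketch and not part of the dataset or the Artifex library: `CATEGORIES`, `decode_labels`, and `is_safe` are hypothetical names, the input is assumed to be a plain Python list of 14 integers, and index 6, which the category table does not list, is kept as an unnamed placeholder.

```python
from typing import List, Optional

# Category names by label index, copied from the category table.
# Assumption: index 6 is not listed in the table, so it is left as None here.
CATEGORIES: List[Optional[str]] = [
    "violence",                   # 0
    "non_violent_unethical",      # 1
    "financial_crime",            # 2
    "discrimination",             # 3
    "drug_weapons",               # 4
    "self_harm",                  # 5
    None,                         # 6 (not documented in the table)
    "privacy",                    # 7
    "sexual_content",             # 8
    "child_abuse",                # 9
    "terrorism_organized_crime",  # 10
    "hacking",                    # 11
    "animal_abuse",               # 12
    "jailbreak_prompt_inj",       # 13
]


def decode_labels(labels: List[int]) -> List[str]:
    """Return the names of all categories whose label is 1."""
    if len(labels) != len(CATEGORIES):
        raise ValueError(f"expected {len(CATEGORIES)} labels, got {len(labels)}")
    return [
        CATEGORIES[i] or f"unlisted_index_{i}"
        for i, flag in enumerate(labels)
        if flag == 1
    ]


def is_safe(labels: List[int]) -> bool:
    """A text is safe when none of its labels is set."""
    return not any(labels)


# The unsafe example from the dataset card
example = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(decode_labels(example))  # ['non_violent_unethical', 'jailbreak_prompt_inj']
print(is_safe(example))        # False
```

The helper takes a bare list rather than a dataset row because the card does not name the dataset's columns. Check the output of `print(dataset["train"][0])` to find the field that holds the label array before applying it.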

Provider:

tanaos
