tanaos/synthetic-guardrail-dataset-v2
Source: Hugging Face | Updated: 2026-02-03 | Indexed: 2026-02-07
Download link:
https://hf-mirror.com/datasets/tanaos/synthetic-guardrail-dataset-v2
Description:
This dataset was generated synthetically by Tanaos with the Artifex Python library. It is designed for training and evaluating guardrail systems, that is, models that detect, classify, or filter unsafe, harmful, or potentially dangerous content. It can be used to train moderation models or to build LLM safety filters into applications such as chatbots, content generation tools, and other user-facing AI systems. Each text sample is paired with an array of 14 binary labels, one per unsafe content category. The categories include violence, non-violent unethical behavior, financial crime, discrimination, illegal drugs or weapons, self-harm, privacy invasion, sexual content, child abuse, terrorism or organized crime, hacking, animal abuse, and jailbreak prompts or instruction injection.
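As a rough illustration of the label layout described above, the following sketch decodes a 14-element binary label array into category names. The category order, the identifier names, and the 14th entry are assumptions made for this example: the description names only 13 categories and does not state how indices map to categories, so the actual mapping should be checked against the dataset card before use.

```python
from typing import List, Sequence

# ASSUMPTION: this ordering simply follows the order in which the description
# lists the categories; the dataset's real index-to-category mapping may differ.
CATEGORY_NAMES: List[str] = [
    "violence",
    "non_violent_unethical_behavior",
    "financial_crime",
    "discrimination",
    "illegal_drugs_or_weapons",
    "self_harm",
    "privacy_invasion",
    "sexual_content",
    "child_abuse",
    "terrorism_or_organized_crime",
    "hacking",
    "animal_abuse",
    "jailbreak_or_instruction_injection",
    # Placeholder: the description mentions 14 labels but names only 13.
    "unnamed_category",
]


def active_categories(
    labels: Sequence[int], names: Sequence[str] = CATEGORY_NAMES
) -> List[str]:
    """Return the names of all categories whose binary label is 1."""
    if len(labels) != len(names):
        raise ValueError(f"expected {len(names)} labels, got {len(labels)}")
    if any(value not in (0, 1) for value in labels):
        raise ValueError("labels must be binary (0 or 1)")
    return [name for name, value in zip(names, labels) if value == 1]


def is_unsafe(labels: Sequence[int]) -> bool:
    """A sample counts as unsafe if at least one category label is set."""
    return any(value == 1 for value in labels)


# Hand-written example vector (not taken from the dataset).
example = [0] * 14
example[0] = 1   # "violence" under the assumed ordering
example[10] = 1  # "hacking" under the assumed ordering

print(active_categories(example))
print(is_unsafe(example))
```

Because a sample can set several labels at once, this is a multi-label layout rather than a single-class one, which is why the helper returns a list instead of a single name.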
Provider:
tanaos



