shannifnju/Aegis-AI-Content-Safety-Dataset-2.0
Source: Hugging Face. Updated 2026-04-24; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/shannifnju/Aegis-AI-Content-Safety-Dataset-2.0
Description:
The Nemotron Content Safety Dataset V2, formerly known as Aegis AI Content Safety Dataset 2.0, comprises 33,416 annotated interactions between humans and LLMs, split into 30,007 training samples, 1,445 validation samples, and 1,964 test samples. This release extends the previously published Nemotron Content Safety Dataset V1. The dataset uses a hybrid data generation pipeline that combines human annotations at the full-dialog level with a multi-LLM jury system that assesses the safety of responses. It follows a taxonomy of safety risks structured into 12 top-level hazard categories with 9 fine-grained subcategories. The data consists of human-written prompts, LLM-generated responses, and response safety labels generated by an ensemble of LLMs. It is intended for building LLM content safety moderation guardrails and for safety alignment of general-purpose LLMs.
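As a quick sanity check on the split sizes quoted above, the following minimal sketch confirms that the three splits add up to the stated total and shows each split's share. The split names `train`, `validation`, and `test` are labels assumed here for illustration; the numbers come straight from the description, and nothing is loaded from the dataset itself.

```python
# Split sizes as stated in the description; the key names are assumed labels.
SPLITS = {"train": 30_007, "validation": 1_445, "test": 1_964}
STATED_TOTAL = 33_416


def split_shares(splits):
    """Return each split's share of the summed sample count as a fraction."""
    total = sum(splits.values())
    return {name: count / total for name, count in splits.items()}


total = sum(SPLITS.values())
shares = split_shares(SPLITS)

if __name__ == "__main__":
    print(f"total = {total} (stated: {STATED_TOTAL})")
    for name, share in shares.items():
        print(f"{name:<10} {SPLITS[name]:>6}  {share:.1%}")
```

The split is roughly 90% training, 4% validation, and 6% test, so evaluation on the validation and test sets rests on comparatively few samples per hazard category.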
Provided by:
shannifnju



