aurora-m/biden-harris-redteam-old

Name: aurora-m/biden-harris-redteam-old
Creator: aurora-m
Published: 2025-10-12 01:56:34
License: no description provided

Hugging Face, updated 2025-10-12, indexed 2025-10-18

Download link:

https://hf-mirror.com/datasets/aurora-m/biden-harris-redteam-old

Description:

This is a red-teaming dataset built around the Biden-Harris AI Executive Order. It consists of instruction-response pairs for training large language models (LLMs) to avoid generating potentially harmful content. The instructions come from filtering human preference datasets and from a semi-automatic, template-based method. The responses were first drafted by GPT-4, then rewritten and expanded by the Aurora-m model, and finally edited by hand so that each one gives a refusal together with an explanation. The dataset covers self-harm, cyber-attacks, illegal acts, privacy infringement, hate speech, and other areas.
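As a rough illustration of how instruction-response pairs like these might be prepared for fine-tuning, the following Python sketch joins each pair into one training string. Several things here are assumptions and not taken from this listing: the column names `instruction` and `response`, the `train` split name, and the `### Instruction:` / `### Response:` template. Only the repository name comes from the page. The loader uses the Hugging Face `datasets` library.

```python
from typing import Dict, Iterable, Iterator

# Repository name as given in this listing.
DATASET_ID = "aurora-m/biden-harris-redteam-old"


def format_pair(instruction: str, response: str) -> str:
    """Join one instruction-response pair into a single training string.

    The template below is an illustrative assumption, not the dataset's
    documented format.
    """
    return (
        f"### Instruction:\n{instruction.strip()}\n"
        f"### Response:\n{response.strip()}"
    )


def iter_training_text(
    rows: Iterable[Dict[str, str]],
    instruction_key: str = "instruction",  # hypothetical column name
    response_key: str = "response",        # hypothetical column name
) -> Iterator[str]:
    """Yield formatted text for every row that has both columns."""
    for row in rows:
        if instruction_key in row and response_key in row:
            yield format_pair(row[instruction_key], row[response_key])


def load_rows(split: str = "train"):
    """Fetch the dataset from the Hugging Face Hub (needs network access).

    The split name is an assumption; requires `pip install datasets`.
    """
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split)
```

Before relying on the default keys, it is worth inspecting `load_rows().column_names` and passing the actual column names to `iter_training_text`, since the real schema may differ from the one assumed here.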

Provider:

aurora-m
