harmbench
ModelScope community. Updated 2025-12-05; indexed 2025-12-06
Download link:
https://modelscope.cn/datasets/swiss-ai/harmbench
Resource description:
# HarmBench
- Original repo: https://github.com/centerforaisafety/HarmBench/tree/main/data/behavior_datasets
This HF dataset contains two types of harmful prompts:
- DirectRequest: taken from [harmbench_behaviors_text_test.csv](https://github.com/centerforaisafety/HarmBench/blob/main/data/behavior_datasets/harmbench_behaviors_text_test.csv) (test split) and [harmbench_behaviors_text_val.csv](https://github.com/centerforaisafety/HarmBench/blob/main/data/behavior_datasets/harmbench_behaviors_text_val.csv) (val split)
- HumanJailbreaks: a manually selected set of jailbreaks, divided into test and val splits
In addition, `metadata.csv` contains information about each behavior and can be mapped to the prompts above by BehaviourID ([original file](https://github.com/centerforaisafety/HarmBench/blob/main/data/behavior_datasets/harmbench_behaviors_text_all.csv)).
How to load the data:
```
from datasets import load_dataset

direct_test = load_dataset("swiss-ai/harmbench", "DirectRequest", split="test")
jailbreak_test = load_dataset("swiss-ai/harmbench", "HumanJailbreaks", split="test")
```
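The mapping between prompts and `metadata.csv` by BehaviourID can be sketched as a plain dictionary join. This is a minimal illustration under stated assumptions: the helper `attach_metadata`, the `Category` and `prompt` column names, and all sample rows are hypothetical placeholders, and the exact spelling of the ID column should be checked against the real files.

```python
import csv
import io


def attach_metadata(prompts, metadata_rows, key="BehaviourID"):
    """Join each prompt record with its metadata row via a shared ID column.

    `key` is an assumption taken from the dataset description; adjust it if
    the real column is spelled differently.
    """
    by_id = {row[key]: row for row in metadata_rows}
    joined = []
    for prompt in prompts:
        meta = by_id.get(prompt[key], {})
        # Prompt fields take precedence if a column name appears in both.
        joined.append({**meta, **prompt})
    return joined


# Made-up stand-in for metadata.csv (not real dataset content).
SAMPLE_METADATA_CSV = """BehaviourID,Category
example_behavior_1,placeholder_category_a
example_behavior_2,placeholder_category_b
"""

# Made-up stand-in for prompt records.
sample_prompts = [
    {"BehaviourID": "example_behavior_1", "prompt": "<prompt text 1>"},
    {"BehaviourID": "example_behavior_2", "prompt": "<prompt text 2>"},
]

metadata_rows = list(csv.DictReader(io.StringIO(SAMPLE_METADATA_CSV)))
joined = attach_metadata(sample_prompts, metadata_rows)
print(joined[0])
```

Records returned by `load_dataset` iterate as dictionaries, so they could be passed as `prompts` in place of the sample list, provided they carry the ID column; prompts with no matching metadata row are returned unchanged.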
Provider:
maas
Created:
2025-11-25



