Harmful Behavior Problems (HBP)
Source: arXiv. Indexed: 2025-09-30
Download link:
https://github.com/llm-attacks/llm-attacks/blob/main/data/advbench/harmful_behaviors.csv
Description:
This dataset contains 520 malicious queries of the kind that commercial large language models (LLMs) prohibit users from submitting, including requests for assistance with criminal activities or cyberattacks. It is used to evaluate the effectiveness of the IntentObfuscator method and to validate jailbreak attack methods against language models.
Provider:
GitHub



