JailbreakEval
收藏arXiv2025-09-30 收录
下载链接:
https://github.com/ThuCCSLab/JailbreakEval
下载链接
链接失效反馈官方服务:
资源简介:
该数据集是一个综合工具包,其中包含了多种用于评估针对大型语言模型(LLMs)的越狱攻击的主流安全评估器。此外,该工具包还支持基于投票的安全评估方式,通过多个安全评估器生成最终的判断结果。其任务是评估越狱攻击。
This dataset constitutes a comprehensive toolkit that integrates multiple mainstream security evaluators designed to assess jailbreak attacks against large language models (LLMs). Furthermore, this toolkit supports voting-based security evaluation approaches, which produce final judgment results by aggregating the outputs of multiple security evaluators. The primary task of this toolkit is to evaluate jailbreak attacks.
提供机构:
ThuCCSLab



