MatanBT/gcg-evaluated-data
Source: Hugging Face | Updated: 2025-06-18 | Indexed: 2025-10-25
Download link:
https://hf-mirror.com/datasets/MatanBT/gcg-evaluated-data
Description:
This dataset contains responses generated by Gemma-2, Qwen-2.5, and Llama-3.1 models when GCG suffixes are appended to harmful instructions. Each row records the harmful instruction, the adversarial suffix, the response and its evaluation, the response under prefilling and its evaluation, suffix-specific evaluations, and details of the suffix optimization. The dataset is designed for research purposes and contains potentially harmful content.
Provider:
MatanBT



