aisi-whitebox/sec_qa_v2_prompted_sandbagging_llama_31_8b_instruct
Source: Hugging Face · Updated 2025-04-09 · Indexed 2025-04-12
Download link:
https://hf-mirror.com/datasets/aisi-whitebox/sec_qa_v2_prompted_sandbagging_llama_31_8b_instruct
Summary:
This is a deception-focused security question-answering dataset generated with the vllm/meta-llama/Llama-3.1-8B-Instruct model for the deception_sprint/sec_qa_v2 task, created on April 9, 2025. It includes sandbagging detection, and no filtering was applied. Two system prompts are used: a benign prompt, and a malicious prompt that instructs the model to give intentionally low-quality answers. Accuracy is reported separately for the normal and sandbagging conditions; accuracy under sandbagging is lower, indicating that the model deliberately reduced its performance on the task. No split strategy was applied, but the test and validation sizes and the random seed are provided. Additional parameters include the data limit, token limit, error failure rate, number of training epochs, maximum connections, and task name.
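The comparison between normal and sandbagging accuracy can be illustrated with a minimal sketch. The `Record` type, its field names, the `"benign"` / `"sandbagging"` labels, and the toy records are all assumptions made for illustration; they are not the dataset's actual schema or results.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical fields, not the dataset's real column names.
    prompt_type: str  # "benign" or "sandbagging" (assumed labels)
    correct: bool     # whether the model's answer matched the target


def accuracy_by_prompt(records):
    """Return the fraction of correct answers for each prompt type."""
    totals, hits = {}, {}
    for r in records:
        totals[r.prompt_type] = totals.get(r.prompt_type, 0) + 1
        hits[r.prompt_type] = hits.get(r.prompt_type, 0) + int(r.correct)
    return {k: hits[k] / totals[k] for k in totals}


# Made-up toy data, only to exercise the function.
toy_records = [
    Record("benign", True),
    Record("benign", True),
    Record("benign", True),
    Record("benign", False),
    Record("sandbagging", True),
    Record("sandbagging", False),
    Record("sandbagging", False),
    Record("sandbagging", False),
]

acc = accuracy_by_prompt(toy_records)
gap = acc["benign"] - acc["sandbagging"]
print(acc, gap)
```

A positive gap between the two conditions is the signal the description refers to: the same model answers fewer questions correctly when given the sandbagging prompt.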
Provided by:
aisi-whitebox



