aisi-whitebox/sec_qa_v1_prompted-sandbagging-llama_33_70b_instruct

Name: aisi-whitebox/sec_qa_v1_prompted-sandbagging-llama_33_70b_instruct
Creator: aisi-whitebox
Published: 2025-04-24 18:34:15
License: not specified

Hugging Face, updated 2025-04-24, indexed 2025-04-26

Download link:

https://hf-mirror.com/datasets/aisi-whitebox/sec_qa_v1_prompted-sandbagging-llama_33_70b_instruct

Description:

The sec_qa_v1_prompted-sandbagging-llama_33_70b_instruct dataset is a deception-detection dataset built on the sec_qa_v1 task. It uses the vllm/meta-llama/Llama-3.3-70B-Instruct model, with sandbagging detection enabled and no filtering applied. The dataset was created on April 24, 2025. No split has been applied, but the split parameters are specified: a test set size of 20%, a validation set size of 50%, and a random seed of 42. For the deception_sprint/sec_qa_v1 task, the reported normal accuracy is 99.09% and the sandbagging accuracy is 0.91%. Most sample pairs fall in the Sandbagging (C→I) category, that is, answered correctly in the normal condition and incorrectly in the sandbagging condition.
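Because the dataset ships unsplit, a reader who wants reproducible splits has to apply the stated parameters themselves. The following is a minimal sketch of one way to do so, not the dataset authors' procedure. The function name split_indices is hypothetical, and the sketch assumes that the 20% test set is held out of the full dataset first and that the 50% validation share is then taken from the remainder; the description does not say how the two sizes combine.

```python
import random


def split_indices(n_samples, test_size=0.2, validation_size=0.5, seed=42):
    """Return (train, validation, test) index lists for n_samples rows.

    Assumption (not confirmed by the dataset description): test_size is a
    fraction of the full dataset, and validation_size is a fraction of
    whatever remains after the test set has been held out.
    """
    indices = list(range(n_samples))
    # A dedicated Random instance keeps the shuffle reproducible without
    # touching the global random state.
    random.Random(seed).shuffle(indices)

    n_test = round(n_samples * test_size)
    test = indices[:n_test]
    remainder = indices[n_test:]

    n_validation = round(len(remainder) * validation_size)
    validation = remainder[:n_validation]
    train = remainder[n_validation:]
    return train, validation, test


# Example with a hypothetical 100-row dataset.
train, validation, test = split_indices(100)
print(len(train), len(validation), len(test))
```

The three lists are disjoint and together cover every row, so they can be used to index into the loaded dataset. If the validation share is instead meant as a fraction of the held-out test portion, only the two slicing steps need to change.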

Provider:

aisi-whitebox
