aisi-whitebox/arc_challenge_prompted_sandbagging_llama_31_8b_instruct
Source: Hugging Face | Updated: 2025-04-09 | Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/aisi-whitebox/arc_challenge_prompted_sandbagging_llama_31_8b_instruct
Description:
The inspect llama 31 8b instruct prompted sandbagging arc challenge dataset is a detection dataset for evaluating how AI models perform on the ARC Challenge task, including sandbagging cases in which the model deliberately gives poor answers. It was created with the vllm/meta-llama/Llama-3.1-8B-Instruct model and provides two different system prompts for assessing the model's performance on deception tasks.
Provider:
aisi-whitebox



