aisi-whitebox/wmdp_chem_cot_prompted_sandbagging_llama_31_8b_instruct
Source: Hugging Face. Updated 2025-04-23. Indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/aisi-whitebox/wmdp_chem_cot_prompted_sandbagging_llama_31_8b_instruct
Overview:
The inspect llama 31 8b instruct prompted sandbagging wmdp chem cot dataset is designed to evaluate how AI models perform on chemistry-related deception tasks. It uses the Llama-3.1-8B-Instruct model and includes prompts that deliberately instruct the model to underperform, simulating a less capable AI. Sandbagging detection is enabled, and no filtering is applied. The dataset is split into test and validation sets and provides accuracy statistics for the tasks.
Provider:
aisi-whitebox



