locuslab/password_eval

Name: locuslab/password_eval
Creator: locuslab
Published: 2025-07-02 14:59:29
License: 暂无描述

Hugging Face2025-07-02 更新2025-07-05 收录

下载链接：

https://hf-mirror.com/datasets/locuslab/password_eval

下载链接

链接失效反馈

官方服务：

资源简介：

PasswordEval是一个合成的评估基准，旨在评估语言模型在对话场景中对简单的基于密码的访问控制执行的能力。每个实例包含一个嵌入密码和敏感信息的系统提示，以及在没有密码的情况下拒绝访问的指示。还包括不含密码的用户提示、提供正确密码的合规用户提示、省略或错误陈述密码的非合规用户提示、在未提供有效密码时的示例拒绝响应以及在提供正确密码时的示例泄露响应。

PasswordEval is a synthetic benchmark designed to evaluate language models ability to enforce simple password-based access controls in conversational settings. Each instance contains a system prompt embedding a single password and a snippet of confidential information with instructions to refuse access unless the password is provided, a user prompt without the password for testing refusal behavior, a compliant user prompt supplying the correct password for testing disclosure behavior, a non-compliant user prompt omitting or misstating the password, example refusals when no valid password is supplied, and example disclosures when the correct password is supplied.

提供机构：

locuslab

5,000+

优质数据集

54 个

任务类型

进入经典数据集