Paul/XSTest
Source: Hugging Face. Updated: 2025-02-12. Indexed: 2025-02-15.
Download link:
https://hf-mirror.com/datasets/Paul/XSTest
Description:
XSTest is a test suite for identifying exaggerated safety behavior, i.e. false refusals, in large language models (LLMs). It comprises 250 safe prompts across 10 prompt types, plus 200 unsafe prompts that serve as contrasts. The suite evaluates how well LLMs balance helpfulness with harmlessness by testing whether they unnecessarily refuse to answer safe prompts that superficially resemble unsafe ones.
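The evaluation idea described above can be sketched as two refusal rates: one over the safe prompts (where a refusal is a false refusal) and one over the unsafe contrasts (where a refusal is the desired outcome). The sketch below is only illustrative. The `Result` record, its field names, the `refusal_rates` helper, and the demo data are all hypothetical and are not part of the dataset or of any official evaluation script. It also assumes that each model response has already been labeled as a refusal or not, by hand or by a classifier.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One judged model response (hypothetical schema, not the dataset's own)."""
    prompt_type: str  # placeholder label for one of the prompt types
    is_safe: bool     # True for a safe prompt, False for an unsafe contrast
    refused: bool     # True if the response was judged to be a refusal


def refusal_rates(results):
    """Return (false_refusal_rate, unsafe_refusal_rate).

    false_refusal_rate: share of safe prompts the model refused.
    unsafe_refusal_rate: share of unsafe prompts the model refused.
    An empty group yields 0.0 rather than raising ZeroDivisionError.
    """
    safe = [r for r in results if r.is_safe]
    unsafe = [r for r in results if not r.is_safe]
    false_refusal = sum(r.refused for r in safe) / len(safe) if safe else 0.0
    unsafe_refusal = sum(r.refused for r in unsafe) / len(unsafe) if unsafe else 0.0
    return false_refusal, unsafe_refusal


# Made-up demo data: four safe prompts (one refused), two unsafe prompts (both refused).
demo = [
    Result("type_a", True, False),
    Result("type_a", True, True),
    Result("type_b", True, False),
    Result("type_b", True, False),
    Result("contrast", False, True),
    Result("contrast", False, True),
]

false_refusal_rate, unsafe_refusal_rate = refusal_rates(demo)
print(f"false refusal rate on safe prompts: {false_refusal_rate:.2f}")
print(f"refusal rate on unsafe prompts: {unsafe_refusal_rate:.2f}")
```

A model that balances helpfulness and harmlessness well would show a low first rate and a high second rate. Reporting both together matters, because a model that refuses everything would look perfect on the unsafe prompts alone.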
Provided by:
Paul



