PKU-BeaverTails

Indexed from arXiv on 2025-09-30

Download link:

https://huggingface.co/datasets/PKU-Alignment/BeaverTails-Evaluation

Resource overview:

This dataset covers a wide range of sensitive topics that may lead to harmful content. It contains 700 evaluation prompts, each labeled with a single category, and is designed to evaluate the safety performance of language models.
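The prompt count and the per-category distribution can be checked directly once the data is downloaded. The following is a minimal sketch, not an official loader: it assumes the Hugging Face `datasets` package is installed, that the repository exposes a `test` split, and that each row carries a `category` field. None of these details are stated on this page, and the helper names `count_by_category` and `load_evaluation_prompts` are hypothetical.

```python
from collections import Counter
from typing import Iterable, Mapping


def count_by_category(records: Iterable[Mapping[str, str]],
                      category_field: str = "category") -> Counter:
    """Tally how many prompts fall under each category label.

    `category_field` is an assumed column name; adjust it to match
    the actual schema of the downloaded data.
    """
    return Counter(record[category_field] for record in records)


def load_evaluation_prompts():
    """Load the evaluation prompts from the Hugging Face Hub.

    Assumptions: the `datasets` package is installed, network access
    is available, and the repository has a "test" split.
    """
    from datasets import load_dataset  # imported lazily; only needed here
    return load_dataset("PKU-Alignment/BeaverTails-Evaluation", split="test")


if __name__ == "__main__":
    # Toy rows standing in for real data; the labels are placeholders.
    sample = [
        {"prompt": "example prompt 1", "category": "category_a"},
        {"prompt": "example prompt 2", "category": "category_a"},
        {"prompt": "example prompt 3", "category": "category_b"},
    ]
    for label, n in count_by_category(sample).items():
        print(label, n)
```

With the real data, `count_by_category(load_evaluation_prompts())` would return one entry per category, and the sum of its values can be compared against the stated 700 prompts.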

Provider:

PKU

Curated summary

Dataset introduction

Background and challenges

Background overview

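PKU-BeaverTails is an AI safety dataset containing 700 test prompts across 14 harm categories. It is intended for evaluating the safety of language models and for supporting research on reducing harm from AI systems.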

The content above was collected and summarized by 遇见数据集.
