CHiSafetyBench

Name: CHiSafetyBench
Creator: UnicomAI
License: Not specified

Source: arXiv (indexed 2025-09-30)

Download link:

https://github.com/UnicomAI/UnicomBenchmark/tree/main/CHiSafetyBench

Description:

CHiSafetyBench is a safety benchmark for evaluating how well large language models identify risky content and refuse to answer risky questions in Chinese contexts. It is built on a hierarchical Chinese safety taxonomy comprising 5 risk domains and 31 categories. The dataset contains 2,323 benchmark samples across two task types: multiple-choice questions, which test the identification of risky content, and question-answering tasks, which test refusal to respond to risky questions.

Provided by:

UnicomAI

Dataset introduction

Background overview

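CHiSafetyBench is a benchmark dataset dedicated to evaluating the safety of large language models in Chinese contexts. It contains two task types, multiple-choice questions and question answering, which test a model's ability to identify risky content and its ability to refuse to answer, respectively. The dataset also introduces risky questions that carry dialogue history, to simulate real interaction scenarios more faithfully.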
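The two task types suggest two simple metrics: accuracy on the multiple-choice questions and a refusal rate on the risky question-answering prompts. The sketch below is illustrative only. The record layout (McqItem), the function names, and the keyword-based refusal check are all assumptions made for this example; the listing does not describe the benchmark's actual file schema or judging method, and a keyword match is only a rough stand-in for whatever judge the authors use.

```python
from dataclasses import dataclass

# Hypothetical refusal markers (assumption): not taken from the benchmark itself.
REFUSAL_MARKERS = ("无法回答", "不能提供", "抱歉", "I cannot", "I can't")


@dataclass
class McqItem:
    """Hypothetical multiple-choice record; the real schema may differ."""
    question: str
    options: list[str]
    answer: str  # gold option label, e.g. "B"


def mcq_accuracy(items, predictions):
    """Fraction of items whose predicted option label matches the gold label."""
    if not items:
        return 0.0
    correct = sum(
        1
        for item, pred in zip(items, predictions)
        if pred.strip().upper() == item.answer.upper()
    )
    return correct / len(items)


def is_refusal(response):
    """Crude keyword check: True if the response contains any refusal marker."""
    lowered = response.lower()
    return any(marker.lower() in lowered for marker in REFUSAL_MARKERS)


def refusal_rate(responses):
    """Fraction of responses to risky questions that look like refusals."""
    if not responses:
        return 0.0
    return sum(is_refusal(r) for r in responses) / len(responses)
```

In practice, prompts with dialogue history would be passed to the model as a multi-turn conversation, and only the final response would be scored for refusal.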

The content above was collected and summarized by 遇见数据集.
