LLM Safety Evaluation Data

Name: LLM Safety Evaluation Data
Creator: maas
Published: 2026-05-20 13:47:41
License: no description available

ModelScope community · updated 2026-05-20 · indexed 2026-05-17

Download link:

https://modelscope.cn/datasets/ZJUICSR/ZJU-SafeEval

Overview:

This dataset is designed for **safety and compliance evaluation of large language models (LLMs)**. It builds on the classification framework of the Chinese national standard "Cybersecurity Technology – Basic Requirements for the Safety of Generative Artificial Intelligence Services" (TC260) and applies the **context-aware red-teaming jailbreak attack method** proposed in the paper "RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent" to systematically rework the original compliance evaluation questions, producing jailbreak prompts that are more challenging and that better simulate realistic threats.

Provider:

maas

Created:

2026-04-28

Collected summary

Dataset introduction

Background and challenges

Background overview

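This dataset is designed for safety and compliance evaluation of large language models (LLMs). Building on the Chinese national standard and the RedAgent context-aware red-teaming attack method, it turns the original evaluation prompts into more challenging jailbreak prompts. It contains more than 100 text samples covering 5 major categories and 31 subcategories of sensitive instructions, and is intended for evaluating the intrinsic safety of LLMs. The data is stored in csv/parquet format, and its use is strictly limited to security research and defensive purposes.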
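
The category and subcategory counts quoted above could be checked against a local CSV export with a short script. The following is a minimal sketch only: the column names `category` and `subcategory` are hypothetical (the listing does not document the schema), and the inline sample uses placeholder rows, not real dataset content.

```python
import csv
import io
from collections import defaultdict


def summarize_taxonomy(csv_text, category_col="category", subcategory_col="subcategory"):
    """Count samples per (category, subcategory) pair in CSV text.

    The column names are assumptions; adjust them to the real file header.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[row[category_col]][row[subcategory_col]] += 1
    return {cat: dict(subs) for cat, subs in counts.items()}


def coverage(summary):
    """Return (number of categories, number of subcategories, number of samples)."""
    n_subcategories = sum(len(subs) for subs in summary.values())
    n_samples = sum(sum(subs.values()) for subs in summary.values())
    return len(summary), n_subcategories, n_samples


# Made-up placeholder rows, used only to exercise the functions.
SAMPLE = """category,subcategory,prompt
A,A-1,<placeholder prompt 1>
A,A-2,<placeholder prompt 2>
B,B-1,<placeholder prompt 3>
A,A-1,<placeholder prompt 4>
"""

summary = summarize_taxonomy(SAMPLE)
print(coverage(summary))  # (2, 3, 4)
```

On the real file, `coverage` would be expected to report 5 categories and 31 subcategories if the description above holds and the assumed column names match the actual header.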

The above content was collected and summarized by 遇见数据集.