walledai/WildGuardTest

Name: walledai/WildGuardTest
Creator: walledai
Published: 2024-07-03 12:24:59
License: odc-by

Source: Hugging Face. Updated 2024-07-03; indexed 2024-07-06.

Download link:

https://hf-mirror.com/datasets/walledai/WildGuardTest

Description:

The WildGuardMix dataset contains 1,725 items for prompt harm, response harm, and response refusal classification tasks. The data types include vanilla and adversarial synthetic data, as well as in-the-wild user-LLM interactions. The labels are annotated by three independent annotators, with Fleiss Kappa scores indicating moderate to substantial agreement. The label quality is further validated using a prompted GPT-4 classifier and manual inspection. The dataset includes examples that might be disturbing, harmful, or upsetting, such as discriminatory language, discussions about abuse, violence, self-harm, sexual content, misinformation, and other high-risk categories.

Provider:

walledai

Summary of original information

WildGuardMix dataset overview

Dataset information

Features

prompt: string
adversarial: boolean
label: string
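
The three features listed above can be mirrored in a small typed record. The following is a minimal sketch, assuming each row arrives as a plain dict with non-null values; the names `WildGuardRow`, `parse_row`, and `count_by_adversarial` are hypothetical helpers and not part of the dataset or any library.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Iterable, Mapping


@dataclass(frozen=True)
class WildGuardRow:
    """One record with the three listed features (hypothetical helper type)."""
    prompt: str
    adversarial: bool
    label: str


def parse_row(raw: Mapping[str, Any]) -> WildGuardRow:
    """Check field presence and types, then build a WildGuardRow."""
    # Assumption: every field is present and non-null in each row.
    expected = {"prompt": str, "adversarial": bool, "label": str}
    for name, kind in expected.items():
        if name not in raw:
            raise KeyError(f"missing field: {name}")
        if not isinstance(raw[name], kind):
            raise TypeError(f"{name} should be {kind.__name__}")
    return WildGuardRow(raw["prompt"], raw["adversarial"], raw["label"])


def count_by_adversarial(rows: Iterable[WildGuardRow]) -> Counter:
    """Count rows by their adversarial flag (keys are True / False)."""
    return Counter(row.adversarial for row in rows)
```

Validating rows up front makes a schema change in a later revision of the dataset fail loudly rather than silently skewing counts.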

Splits

train: 1,725 examples, 856,863 bytes

File size

Download size: 490,550 bytes
Dataset size: 856,863 bytes

Configurations

default: training data files at path data/train-*
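
As a rough illustration of how the default configuration and its single train split might be consumed, here is a minimal sketch using the Hugging Face `datasets` library. It assumes the library is installed and that the repository is reachable (access may require authentication); the function names and the example shard filename are hypothetical.

```python
from fnmatch import fnmatchcase

# Glob taken from the listing's default configuration.
TRAIN_GLOB = "data/train-*"


def is_train_shard(path: str) -> bool:
    """Return True if a repository file path matches the train glob."""
    return fnmatchcase(path, TRAIN_GLOB)


def load_wildguard_test(repo_id: str = "walledai/WildGuardTest",
                        split: str = "train"):
    """Load the requested split; needs network access and `datasets`."""
    # Imported lazily so this module can be imported without the library.
    from datasets import load_dataset
    return load_dataset(repo_id, split=split)


if __name__ == "__main__":
    # Hypothetical shard name, used only to exercise the glob.
    print(is_train_shard("data/train-00000-of-00001.parquet"))
    ds = load_wildguard_test()
    print(ds.features)
    print(len(ds))
```

If the download succeeds, the reported row count can be compared against the 1,725 examples listed for the train split.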

License

odc-by

Task categories

text-classification

Language

Size category

10K<n<100K

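Dataset overview

Data scale

Size: 1,725 examples for prompt harm, response harm, and response refusal classification tasks

Data types

Vanilla and adversarial synthetic data, plus in-the-wild user-LLM interactions

Disclaimer

The data contains content that may be disturbing, harmful, or upsetting, covering discriminatory language, abuse, violence, self-harm, sexual content, misinformation, and other high-risk categories. Training a language model solely on the harmful examples is not recommended.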
