PolygloToxicityPrompts

Name: PolygloToxicityPrompts
Creator: Hugging Face
License: not specified

Indexed from arXiv on 2025-09-30

Download link:

https://huggingface.co/datasets/ToxicityPrompts/PolygloToxicityPrompts

Resource overview:

This dataset is a large-scale multilingual benchmark covering 17 languages, with text samples classified into four toxicity levels. It also includes unique trigger phrases designed to evaluate cross-lingual backdoor attacks against language models. Its core task is text toxicity classification.
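As a rough illustration of the structure described above (samples in several languages, each assigned one of four toxicity levels), the following minimal sketch tallies records by language and level. The field names `language`, `toxicity_level`, and `text`, the level names, and the sample records are all assumptions made for illustration; the listing does not document the dataset's actual schema, so check the dataset page for the real column names.

```python
from collections import Counter
from typing import Iterable, Mapping

# Hypothetical level names: the description only says there are four levels.
TOXICITY_LEVELS = ("level_0", "level_1", "level_2", "level_3")


def count_by_language_and_level(records: Iterable[Mapping[str, object]]) -> Counter:
    """Count samples per (language, toxicity level) pair."""
    counts: Counter = Counter()
    for record in records:
        level = record["toxicity_level"]  # assumed field name
        if level not in TOXICITY_LEVELS:
            raise ValueError(f"unknown toxicity level: {level!r}")
        counts[(record["language"], level)] += 1  # assumed field name
    return counts


# Made-up placeholder records, not taken from the dataset.
sample_records = [
    {"language": "en", "toxicity_level": "level_0", "text": "placeholder A"},
    {"language": "en", "toxicity_level": "level_3", "text": "placeholder B"},
    {"language": "de", "toxicity_level": "level_0", "text": "placeholder C"},
    {"language": "en", "toxicity_level": "level_0", "text": "placeholder D"},
]

counts = count_by_language_and_level(sample_records)
```

A tally like this is a quick way to check how evenly the four levels are represented in each language before using the data for toxicity classification.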

Provider:

Hugging Face
