
aisingapore/nlu-toxicity_detection

Source: Hugging Face. Updated 2024-12-20; indexed 2024-12-21.
Download link:
https://hf-mirror.com/datasets/aisingapore/nlu-toxicity_detection
Resource overview:
The SEA Toxicity Detection dataset is designed to evaluate a model's ability to identify toxic content, such as hate speech and abusive language, in text. It covers Indonesian, Thai, and Vietnamese, sampled from MLHSD, TTD, and ViHSD respectively. The dataset supports text generation and text classification tasks and is intended primarily for evaluating chat or instruction-tuned large language models (LLMs) as part of the SEA-HELM leaderboard. It is split by language and includes additional few-shot example splits. The dataset's documentation also reports the number of examples per language, token counts for different models, and the sources and license information.
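
As a rough illustration of how few-shot example splits are typically used when evaluating an instruction-tuned LLM on a classification task, here is a minimal sketch. The record fields (`text`, `label`), the label names (`clean`, `toxic`), the prompt wording, and the helper functions `build_prompt` and `accuracy` are all assumptions made for illustration; they are not taken from the dataset's actual schema or from the SEA-HELM evaluation code.

```python
from typing import Dict, List

# Hypothetical records: the field names ("text", "label") and the label set
# are assumptions for illustration, not the dataset's actual schema.
FEWSHOT_EXAMPLES: List[Dict[str, str]] = [
    {"text": "Have a nice day!", "label": "clean"},
    {"text": "You are a worthless idiot.", "label": "toxic"},
]


def build_prompt(fewshot: List[Dict[str, str]], query: str) -> str:
    """Format the few-shot examples, followed by the unlabeled query text."""
    lines = ["Classify the text as 'clean' or 'toxic'.", ""]
    for example in fewshot:
        lines.append(f"Text: {example['text']}")
        lines.append(f"Label: {example['label']}")
        lines.append("")
    lines.append(f"Text: {query}")
    lines.append("Label:")
    return "\n".join(lines)


def accuracy(predictions: List[str], references: List[str]) -> float:
    """Fraction of predictions that exactly match the reference labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Build a prompt for one query; a model's completion would then be compared
# against the gold label with accuracy().
prompt = build_prompt(FEWSHOT_EXAMPLES, "Thanks for your help.")
```

In practice the few-shot records would come from the dataset's per-language few-shot splits rather than a hard-coded list, and the metric may differ from plain accuracy.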
Provider:
aisingapore