
tcapelle/kaggle-toxic-annotated

Source: Hugging Face. Updated 2024-11-30. Indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/tcapelle/kaggle-toxic-annotated
Resource overview:
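This dataset is a GPT-4o-mini-annotated version of the Kaggle toxicity dataset, annotated with the same prompt used for Toxic-Commons [Celadon]. It contains comment text and several toxicity-related labels: toxic, severe_toxic, obscene, threat, insult, and identity_hate. It also contains a structured field named toxic_commons_label, which holds sub-fields covering discrimination and violence together with their scores, such as ability discrimination, gender discrimination, racial discrimination, and religious discrimination. The dataset is split into a training set and a test set containing 159570 and 153163 samples respectively. The total download size is 159301273 bytes and the total dataset size is 370369620 bytes.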

This dataset contains comment text and its associated labels, which indicate whether each comment contains toxicity, severe toxicity, obscenity, threats, insults, or identity hate. It also includes a structured label that records the reasoning and score for different types of discrimination and violent behavior. The dataset is divided into training and test sets containing 159570 and 153163 samples respectively. It was annotated using the gpt-4o-mini model with the same prompt used for Toxic-Commons.
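To make the record layout described above more concrete, here is a minimal Python sketch of what one row might look like and how its labels could be read. The six binary label names, the toxic_commons_label field name, and the split sizes come from the description; the sub-field keys (for example `ability_discrimination`), the `reasoning`/`score` layout, the example values, and the helper function names are illustrative assumptions, not the dataset's confirmed schema.

```python
from typing import Any, Dict, List

# The six toxicity labels named in the dataset description.
KAGGLE_LABELS: List[str] = [
    "toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate",
]

# Split sizes as stated in the description.
SPLIT_SIZES: Dict[str, int] = {"train": 159570, "test": 153163}

# Hypothetical record. The sub-field names and the reasoning/score layout
# under "toxic_commons_label" are assumptions made for illustration.
example_record: Dict[str, Any] = {
    "comment_text": "Placeholder comment text.",
    "toxic": 1,
    "severe_toxic": 0,
    "obscene": 0,
    "threat": 0,
    "insult": 1,
    "identity_hate": 0,
    "toxic_commons_label": {
        "ability_discrimination": {"reasoning": "placeholder", "score": 0},
        "gender_discrimination": {"reasoning": "placeholder", "score": 0},
        "racial_discrimination": {"reasoning": "placeholder", "score": 0},
        "religious_discrimination": {"reasoning": "placeholder", "score": 0},
    },
}


def active_labels(record: Dict[str, Any]) -> List[str]:
    """Return the names of the binary labels that are set to 1."""
    return [name for name in KAGGLE_LABELS if record.get(name) == 1]


def max_toxic_commons_score(record: Dict[str, Any]) -> int:
    """Return the highest score across the structured sub-fields."""
    sub_fields = record["toxic_commons_label"]
    return max(field["score"] for field in sub_fields.values())


if __name__ == "__main__":
    print(active_labels(example_record))
    print(max_toxic_commons_score(example_record))
    print(sum(SPLIT_SIZES.values()))
```

With the Hugging Face `datasets` library installed, `load_dataset("tcapelle/kaggle-toxic-annotated")` should return the actual splits; inspecting the `features` attribute of a split is the reliable way to see the real sub-field names before relying on the keys assumed here.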
Provider:
tcapelle