AlexSham/Toxic_Russian_Comments
Source: Hugging Face. Updated 2024-03-24; indexed 2024-06-22.
Download link:
https://hf-mirror.com/datasets/AlexSham/Toxic_Russian_Comments
Description:
---
task_categories:
- text-classification
language:
- ru
tags:
- NLP
- Toxic
- Russian
- classification
- binary classification
pretty_name: Toxic russian comments from ok.ru
---
https://www.kaggle.com/datasets/alexandersemiletov/toxic-russian-comments
0 - neutral user comments
1 - toxic user comments
-----------------------
Toxic Russian Comments Dataset
This dataset contains labelled comments from the popular Russian social network ok.ru.
The data was used in a competition where participants had to automatically label each comment with at least one of the four predefined classes. The classes represent different levels of toxicity. The competition was held on the All Cups platform.
Each comment carries one or more of the following classes, with labels written in the fastText format:
__label__NORMAL - neutral user comments
__label__INSULT - comments that humiliate a person
__label__THREAT - comments with an explicit intent to harm another person
__label__OBSCENITY - comments that contain a description or a threat of sexual assault
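
The relationship between the four fastText-style classes and the binary 0/1 labels listed earlier can be sketched in Python. This is a minimal sketch, not the dataset author's code. It makes three assumptions: labels appear at the start of each line, multiple labels may be joined by commas or spaces, and a comment counts as toxic (1) when it carries any class other than NORMAL. The function names `parse_fasttext_line` and `to_binary` and the sample lines are hypothetical.

```python
from typing import List, Tuple

LABEL_PREFIX = "__label__"


def parse_fasttext_line(line: str) -> Tuple[List[str], str]:
    """Split a fastText-style line into (labels, comment text).

    Assumption: labels lead the line and may be separated by
    whitespace or commas. Whitespace in the text is normalised
    to single spaces.
    """
    labels: List[str] = []
    tokens = line.strip().split()
    i = 0
    # Consume leading tokens that start with the label prefix.
    while i < len(tokens) and tokens[i].startswith(LABEL_PREFIX):
        for part in tokens[i].split(","):
            if part.startswith(LABEL_PREFIX):
                labels.append(part[len(LABEL_PREFIX):])
        i += 1
    return labels, " ".join(tokens[i:])


def to_binary(labels: List[str]) -> int:
    """Assumed mapping: 1 if any label is not NORMAL, else 0."""
    return int(any(label != "NORMAL" for label in labels))


# Made-up sample lines, not taken from the dataset.
samples = [
    "__label__NORMAL example neutral comment",
    "__label__INSULT,__label__THREAT example toxic comment",
]
for sample in samples:
    labels, text = parse_fasttext_line(sample)
    print(to_binary(labels), labels, text)
```

If the distributed files already contain the 0/1 column, this conversion is unnecessary; it only illustrates how the two label schemes could line up.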
Provided by:
AlexSham
Summary of original information
Dataset overview
Basic information
- Task category: text classification
- Language: Russian
- Tags: NLP, Toxic, Russian, classification, binary classification
- Name: Toxic russian comments from ok.ru
Data content
- Source: the Russian social network ok.ru
- Data type: user comments
- Label classes:
  - 0: neutral user comments
  - 1: toxic user comments
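Detailed description
- Competition use: the dataset was used in a competition in which participants had to automatically label each comment with at least one of four predefined classes.
- Class definitions:
  - __label__NORMAL: neutral user comments
  - __label__INSULT: insulting comments
  - __label__THREAT: threatening comments
  - __label__OBSCENITY: comments containing a description or a threat of sexual assault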



