
AlexSham/Toxic_Russian_Comments

Source: Hugging Face. Updated: 2024-03-24. Indexed: 2024-06-22.
Download link:
https://hf-mirror.com/datasets/AlexSham/Toxic_Russian_Comments
Description:
---
task_categories:
- text-classification
language:
- ru
tags:
- NLP
- Toxic
- Russian
- classification
- binary classification
pretty_name: Toxic russian comments from ok.ru
---

https://www.kaggle.com/datasets/alexandersemiletov/toxic-russian-comments

0 - neutral user comments
1 - toxic user comments

-----------------------

Toxic Russian Comments Dataset

This dataset contains labelled comments from the popular Russian social network ok.ru. The data was used in a competition where participants had to automatically label each comment with at least one of the four predefined classes. The classes represent different levels of toxicity. The competition was held on the All Cups platform.

Each comment belongs to one of the following classes, with each label complying with the fastText formatting rules:

__label__NORMAL - neutral user comments
__label__INSULT - comments that humiliate a person
__label__THREAT - comments with an explicit intent to harm another person
__label__OBSCENITY - comments that contain a description or a threat of a sexual assault
Provider:
AlexSham
Summary of original information

Dataset overview

Basic information

  • Task category: text classification
  • Language: Russian
  • Tags: NLP, Toxic, Russian, classification, binary classification
  • Name: Toxic russian comments from ok.ru

Data content

  • Data source: the Russian social network ok.ru
  • Data type: user comments
  • Label classes:
    • 0 - neutral user comments
    • 1 - toxic user comments

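Detailed description

  • Competition use: the dataset was used in a competition in which participants had to automatically label each comment with at least one of four predefined classes.
  • Class definitions:
    • __label__NORMAL - neutral user comments
    • __label__INSULT - comments that humiliate a person
    • __label__THREAT - comments with an explicit intent to harm another person
    • __label__OBSCENITY - comments that contain a description or a threat of a sexual assault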
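
The fastText-style labels listed above can be split from the comment text with a few lines of Python. The following is a minimal sketch, not the dataset author's tooling. It makes three assumptions: labels appear at the start of each line, separated by commas or whitespace; the helper names `parse_fasttext_line` and `to_binary` are hypothetical; and collapsing the four classes to the 0/1 labels by treating every non-NORMAL class as toxic is a guess, since the source does not say how the binary labels were derived.

```python
from typing import List, Tuple

LABEL_PREFIX = "__label__"


def parse_fasttext_line(line: str) -> Tuple[List[str], str]:
    """Split one fastText-style line into (labels, text).

    Assumption: leading tokens carry the labels, joined either by
    commas ("__label__A,__label__B text") or by whitespace.
    Whitespace inside the comment text is normalized to single spaces.
    """
    tokens = line.strip().split()
    labels: List[str] = []
    i = 0
    while i < len(tokens):
        parts = [p for p in tokens[i].split(",") if p]
        # Stop at the first token that is not made up purely of labels.
        if not parts or not all(p.startswith(LABEL_PREFIX) for p in parts):
            break
        labels.extend(p[len(LABEL_PREFIX):] for p in parts)
        i += 1
    return labels, " ".join(tokens[i:])


def to_binary(labels: List[str]) -> int:
    """Map class names to 0 (neutral) or 1 (toxic).

    Assumption: any class other than NORMAL counts as toxic.
    A line with no labels at all maps to 0.
    """
    return int(any(label != "NORMAL" for label in labels))
```

For example, `parse_fasttext_line("__label__INSULT,__label__THREAT some comment")` yields the labels `INSULT` and `THREAT` plus the remaining text, and `to_binary` maps that label list to 1.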