tasksource/context_toxicity

Name: tasksource/context_toxicity
Creator: tasksource
Published: 2023-07-02 12:58:26
License: Apache-2.0

Source: Hugging Face. Updated 2023-07-02. Indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/tasksource/context_toxicity


Resource description:

---
license: apache-2.0
---

https://github.com/ipavlopoulos/context_toxicity/

```
@inproceedings{xenos-etal-2021-context,
    title = "Context Sensitivity Estimation in Toxicity Detection",
    author = "Xenos, Alexandros and Pavlopoulos, John and Androutsopoulos, Ion",
    booktitle = "Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.woah-1.15",
    doi = "10.18653/v1/2021.woah-1.15",
    pages = "140--145",
    abstract = "User posts whose perceived toxicity depends on the conversational context are rare in current toxicity detection datasets. Hence, toxicity detectors trained on current datasets will also disregard context, making the detection of context-sensitive toxicity a lot harder when it occurs. We constructed and publicly release a dataset of 10k posts with two kinds of toxicity labels per post, obtained from annotators who considered (i) both the current post and the previous one as context, or (ii) only the current post. We introduce a new task, context-sensitivity estimation, which aims to identify posts whose perceived toxicity changes if the context (previous post) is also considered. Using the new dataset, we show that systems can be developed for this task. Such systems could be used to enhance toxicity detection datasets with more context-dependent posts or to suggest when moderators should consider the parent posts, which may not always be necessary and may introduce additional costs.",
}
```

Provider:

tasksource

Summary of original information

Dataset overview

Dataset name

Context Sensitivity Estimation in Toxicity Detection

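Dataset description

The dataset contains 10,000 user posts. Each post carries two kinds of toxicity labels, obtained from annotators who considered one of the following:

Both the current post and the previous post as context.
Only the current post.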

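Dataset usage

The dataset supports developing and evaluating systems for the context-sensitivity estimation task, which aims to identify posts whose perceived toxicity changes when the context (the previous post) is also considered.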
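To make the task concrete, here is a minimal sketch of how a context-sensitivity score might be derived from the two labels per post. It assumes each label is a toxicity score in [0, 1]. The record type `LabeledPost`, its field names, the example posts, and the 0.5 cutoff are all hypothetical illustrations, not the dataset's actual schema or the authors' method.

```python
from dataclasses import dataclass


@dataclass
class LabeledPost:
    """Hypothetical record; field names are assumptions, not the dataset's schema."""
    text: str
    parent: str
    toxicity_with_context: float     # assumed in [0, 1]; annotators saw the parent post
    toxicity_without_context: float  # assumed in [0, 1]; annotators saw only the post


def context_sensitivity(post: LabeledPost) -> float:
    """Signed change in perceived toxicity when the parent post is shown."""
    return post.toxicity_with_context - post.toxicity_without_context


def is_context_sensitive(post: LabeledPost, threshold: float = 0.5) -> bool:
    """Flag posts whose score moves by at least `threshold` (an illustrative cutoff)."""
    return abs(context_sensitivity(post)) >= threshold


# Made-up examples: the first post reads the same with or without its parent,
# the second is only perceived as toxic once the parent post is visible.
posts = [
    LabeledPost("You would say that.", "I think the policy is fine.", 0.25, 0.25),
    LabeledPost("Exactly like them.", "An insulting remark about a group.", 1.0, 0.25),
]
flagged = [p.text for p in posts if is_context_sensitive(p)]
print(flagged)
```

A signed difference keeps the direction of the change, so the same score can separate posts that look more toxic in context from those that look less toxic. A system for the task would predict this score from the text instead of reading it off the annotations.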

Publication date

August 2021

Dataset link

License

Apache-2.0
