abdulrub/hate_speech_dataset

Name: abdulrub/hate_speech_dataset
Creator: abdulrub
Published: 2025-02-11 06:59:50
License: 暂无描述

Hugging Face2025-02-11 更新2025-02-15 收录

下载链接：

https://hf-mirror.com/datasets/abdulrub/hate_speech_dataset

下载链接

链接失效反馈

官方服务：

资源简介：

该数据集旨在用于微调语言模型，特别是Qwen2.5-1.5B-Instruct模型，用于社交媒体文本（推文）中的仇恨言论检测任务。数据集包括隐性仇恨言论和显性仇恨言论的例子，目的是提高小型语言模型在此挑战性任务中的性能。数据集由两个现有数据集组合而成：SALT-NLP/ImplicitHate数据集和TweetEval数据集的hate配置。数据集被划分为训练集、验证集和测试集，分别包含2500、150和100个示例，用于训练、调整和最终评估模型性能。每个示例包含推文的文本内容和标签，标签分为非仇恨言论和仇恨言论。

This dataset is designed for fine-tuning language models, particularly the Qwen2.5-1.5B-Instruct model, for the task of hate speech detection in social media text (tweets). It includes examples of both implicit and explicit hate speech, aiming to improve the performance of smaller language models on this challenging task. The dataset is a combination of two existing datasets: the SALT-NLP/ImplicitHate dataset and the TweetEval datasets hate configuration. It is split into a training set with 2500 examples, a validation set with 150 examples, and a test set with 100 examples, for training, tuning, and final evaluation of the models performance. Each example consists of the text content of a tweet and a label, categorized as not hate speech or hate speech.

提供机构：

abdulrub

5,000+

优质数据集

54 个

任务类型

进入经典数据集