
"Hate speech dataset with style perturbation"

Source: DataCite Commons | Updated: 2026-03-16 | Indexed: 2026-05-03
Download link:
https://ieee-dataport.org/documents/hate-speech-dataset-style-perturbation
Description:
"We present a style-perturbed benchmark for robustness evaluation in hate speech detection. The benchmark is constructed from three widely used datasets, namely IHC, SBIC, and DynaHate, and introduces multiple stylistic perturbation settings while preserving the original label semantics as much as possible. For each dataset, we retain the original, unperturbed version together with several perturbed variants, enabling controlled comparison between clean and style-shifted inputs. The resulting resource is designed to test whether a model relies on task-relevant semantic content or exhibits residual dependence on superficial stylistic cues. In addition to the perturbed texts and class labels, the benchmark preserves sample-level metadata to support reproducible evaluation and fine-grained analysis across perturbation types and annotation rounds. This benchmark provides a practical testbed for studying robustness, prediction consistency, and label stability under semantics-preserving stylistic variation, and is particularly suitable for evaluating classifiers based on large language models and causal content-style disentanglement methods."
Provider:
IEEE DataPort
Created:
2026-03-16
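
The description mentions controlled comparison between clean and style-shifted inputs, along with prediction consistency across perturbation types. The following is a minimal sketch of how such a comparison might be computed, not the dataset authors' evaluation code. The record keys (`sample_id`, `variant`, `label`, `prediction`), the variant name `"formal"`, and the function `robustness_report` are all hypothetical; the listing does not state the benchmark's actual file schema.

```python
from collections import defaultdict


def robustness_report(records):
    """Per-variant accuracy and agreement with the clean prediction.

    `records` is an iterable of dicts with the assumed (hypothetical) keys
    'sample_id', 'variant', 'label', and 'prediction'. The unperturbed
    text is assumed to carry variant == "clean".
    """
    records = list(records)

    # Index each sample's prediction on its clean (unperturbed) text.
    clean_pred = {
        r["sample_id"]: r["prediction"]
        for r in records
        if r["variant"] == "clean"
    }

    stats = defaultdict(
        lambda: {"n": 0, "correct": 0, "paired": 0, "consistent": 0}
    )
    for r in records:
        s = stats[r["variant"]]
        s["n"] += 1
        s["correct"] += int(r["prediction"] == r["label"])
        # Consistency is only defined when a clean counterpart exists.
        if r["sample_id"] in clean_pred:
            s["paired"] += 1
            s["consistent"] += int(
                r["prediction"] == clean_pred[r["sample_id"]]
            )

    return {
        variant: {
            "accuracy": s["correct"] / s["n"],
            "consistency": (
                s["consistent"] / s["paired"] if s["paired"] else None
            ),
        }
        for variant, s in stats.items()
    }


# Toy records for illustration only; these are not taken from the dataset.
example_records = [
    {"sample_id": 1, "variant": "clean", "label": 1, "prediction": 1},
    {"sample_id": 1, "variant": "formal", "label": 1, "prediction": 0},
    {"sample_id": 2, "variant": "clean", "label": 0, "prediction": 0},
    {"sample_id": 2, "variant": "formal", "label": 0, "prediction": 0},
]
report = robustness_report(example_records)
```

In this toy example the `"formal"` variant agrees with the clean prediction on one of two samples. A gap between clean and perturbed accuracy, or a consistency well below 1, would suggest the kind of dependence on stylistic cues that the benchmark is meant to expose.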