RefutES

SSH Open Marketplace | Updated: 2026-03-20 | Indexed: 2026-03-21

Download link:

https://marketplace.sshopencloud.eu/dataset/cgZDdl

Description:

The RefutES dataset was created for the research and evaluation of systems capable of generating counter-narratives to hate speech. The dataset is based on the CONAN-MT-SP corpus, which contains pairs of hate speech (HS) and counter-narrative (CN) messages targeting eight specific groups: people with disabilities, Jews, the LGBT+ community, migrants, Muslims, people of color, women, and other groups.

To construct CONAN-MT-SP, the original English CONAN-MT corpus was used as a starting point; its hate speech messages were translated into Spanish using the DeepL API and subsequently reviewed and corrected by human annotators. The associated counter-narratives were generated with a language model (GPT-4) via a few-shot learning strategy and evaluated by human experts across multiple dimensions: level of offensiveness, stance toward the message, degree of informativeness, veracity, need for editing, and comparison between human and automated responses.

Based on this process, RefutES selects only those counter-narratives considered "perfect", that is, non-offensive, clearly at odds with the hate speech, informative, truthful, and requiring no editing.

The corpus is divided into three subsets, each related to a different part of the competition:

- Train split: contains 2496 HS-CN pairs.
- Dev split: contains 279 HS-CN pairs.
- Test split: contains 156 HS-CN pairs, of which 78 were generated by GPT-4 and manually annotated by humans and the other 78 were written by humans.
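
The selection criterion for "perfect" counter-narratives described above can be sketched as a simple filter. This is only an illustration, not the authors' actual pipeline: the `AnnotatedCN` class, its field names, and the use of yes/no flags are all assumptions (the description mentions graded levels for some dimensions but does not specify the annotation schema), and the texts are placeholders rather than corpus data.

```python
from dataclasses import dataclass


@dataclass
class AnnotatedCN:
    # Hypothetical schema: the source does not specify field names or scales.
    # Each flag is a simplified yes/no stand-in for one evaluation dimension.
    text: str
    offensive: bool
    opposes_hate_speech: bool
    informative: bool
    truthful: bool
    needs_editing: bool


def is_perfect(cn: AnnotatedCN) -> bool:
    """Return True only if every criterion named in the description holds."""
    return (
        not cn.offensive
        and cn.opposes_hate_speech
        and cn.informative
        and cn.truthful
        and not cn.needs_editing
    )


# Placeholder records for illustration only (not taken from the corpus).
candidates = [
    AnnotatedCN("placeholder CN A", False, True, True, True, False),
    AnnotatedCN("placeholder CN B", False, True, True, True, True),   # needs editing
    AnnotatedCN("placeholder CN C", True, True, True, True, False),   # offensive
]

selected = [cn for cn in candidates if is_perfect(cn)]
print([cn.text for cn in selected])
```

A counter-narrative that fails any single dimension is dropped, which matches the "only those considered perfect" wording; a real implementation would need to map each graded annotation to a pass/fail threshold.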

Created:

2026-03-20
