Hate Speech Library in Spanish

Indexed from NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/records/11099511
Description:
Library of hate speech detected in digital news media in Spain, the result of the "Hatemedia" project (PID2020-114584GB-I00), financed by the State Research Agency - Ministry of Science and Innovation.

The library contains the 7,210 most repeated simple and compound slogans that, from a semantic point of view, tend toward hate in digital news media in Spain. Preparing this final document required the following phases:

LABELING OF EXPRESSIONS AND EXTRACTION OF SLOGANS: In the first phase, a total of 476,753 messages associated with digital news media in Spain were reviewed. Approximately 4.5% of these messages were identified as containing expressions tending toward hatred. Stop-words were removed from the identified messages, and anomalous data (terms that did not belong to a known language or were diminutives of known words) were flagged and manually reviewed, in order to identify both simple and compound slogans that tended toward hatred.

IDENTIFICATION OF DUPLICATES: In the second phase, two lists were made, the first of simple lemmas and the second of compound lemmas. Each list was filtered to identify repeated lemmas, yielding two libraries in which each lemma appears only once.

DATABASE INTEGRATION: In the third phase, both libraries were joined to build a final library integrating all the lemmas, both simple and compound. A final filtering pass was then carried out to ensure that no lemma was repeated.

Authors: Elias Said-Hung, Max Römer Pieretti, Julio Montero-Díaz, Alberto De Lucas, Javier Martínez Torres.

Supported by: POSSIBLE S.L.

For more information: https://www.hatemedia.es/ or contact elias.said@unir.net
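
The duplicate-identification and integration phases described above can be illustrated with a minimal Python sketch. This is not the project's code: the function names `dedupe` and `build_library` are hypothetical, the input terms are placeholders, and matching lemmas case-insensitively after trimming whitespace is an assumption, since the description does not say how repeated lemmas were compared.

```python
def dedupe(lemmas):
    """Return lemmas in first-seen order with duplicates removed."""
    seen = set()
    unique = []
    for lemma in lemmas:
        # Assumption: lemmas are compared after trimming and lowercasing.
        key = lemma.strip().lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def build_library(simple_lemmas, compound_lemmas):
    """Deduplicate each list, join them, then filter the joined list again."""
    simple = dedupe(simple_lemmas)        # library of simple lemmas
    compound = dedupe(compound_lemmas)    # library of compound lemmas
    return dedupe(simple + compound)      # final pass over the joined library


# Placeholder terms, not entries from the actual library.
library = build_library(
    ["Alpha", "beta", "alpha "],
    ["beta gamma", "Beta Gamma", "beta"],
)
print(library)
```

The final `dedupe` call matters because a term can appear in both input lists, as "beta" does in the placeholder data, so deduplicating each list separately does not guarantee a unique joined library.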
Created:
2024-05-01