
The Warsaw Multimodal Hate Speech Database (WMHS)

Source: IEEE (indexed 2026-04-17)
Download link:
https://ieee-dataport.org/documents/warsaw-multimodal-hate-speech-database-wmhs
Description:
Hate speech, defined as derogatory statements directed against individuals or groups because of their immutable characteristics, has far-reaching negative consequences. The effects of hatetext (written hate speech) have been studied extensively. However, the role of other modalities, particularly hatespeech (spoken hate speech), remains underexplored due to the lack of annotated multimodal databases and the ambiguity that arises from varying definitions of hate speech. The present work introduces the Warsaw Multimodal Hate Speech Database (WMHS), a new resource designed to fill these gaps. The database contains over 9 hours of manually annotated multimodal Polish hate speech, primarily targeting six groups and covering five types of content collected from YouTube, Facebook, and the Polish platform BanBye. We present initial validation results from two pilot studies (N = 20), demonstrating high levels of offensiveness and emotional charge. We also describe initial results for automatic detection of hate speech from text and speech, with accuracies of up to 98.7% for hatetext, 86.5% for hatespeech, and 97.8% for the joint model. Although the speech model performed worse than the text model, it correctly identified two of the three misclassified samples, highlighting its complementary value. We discuss how the WMHS can support a broad range of future work, including training machine learning models and psychological research on reactions to hate speech. The WMHS, together with associated files, is available at OSF: https://osf.io/5xaqh/?view_only=b8ff0e59548f48b99f8c66863146a25e
Provided by:
Tanja Schultz; Rathi Adarshi Rammohan; Aleksandra Świderska; Dennis Kuester; Dominik Puchała