"The Warsaw Multimodal Hate Speech Database (WMHS)"
Source: DataCite Commons | Updated: 2025-06-19 | Indexed: 2026-05-03
Download link:
https://ieee-dataport.org/documents/warsaw-multimodal-hate-speech-database-wmhs
Resource description:
"Hate speech, defined as derogatory statements directed against individuals or groups because of their immutable characteristics, has far-reaching negative consequences. The effects of hatetext (written hate speech) have been studied extensively. However, the role of other modalities, particularly hatespeech (spoken hate speech), remains underexplored due to the lack of annotated multimodal databases and the ambiguity that arises from varying definitions of hate speech. The present work introduces the Warsaw Multimodal Hate Speech Database (WMHS), a new resource designed to fill these gaps. The database contains over 9 hours of manually annotated multimodal Polish hate speech, primarily targeting six groups and covering five types of content collected from YouTube, Facebook, and the Polish platform BanBye. We present initial validation results from two pilot studies (N = 20), demonstrating high levels of offensiveness and emotional charge. We also describe initial results for automatic detection of hate speech from text and speech, with accuracies up to 98.7% for hatetext, 86.5% for hatespeech, and 97.8% for the joint model. Despite the poor performance of the speech model compared to text, the model correctly identified two out of three misclassified samples, highlighting its complementary value. We discuss how the WMHS can support a broad range of future work, including training machine learning models and psychological research on reactions to hate speech. The WMHS, together with associated files, is available at OSF: https://osf.io/5xaqh/?view_only=b8ff0e59548f48b99f8c66863146a25e"
Provider:
IEEE DataPort
Created:
2025-06-19



