XTREMESPEECH
Source: arXiv | Updated: 2022-03-22 | Indexed: 2024-06-21
Download link:
https://github.com/antmarakis/xtremespeech
Description:
XTREMESPEECH is a dataset of 20,297 social media texts from four countries (Brazil, Germany, India, and Kenya), created by a research team at Ludwig Maximilian University of Munich. Its distinguishing feature is that members of the affected communities took part directly in collecting and annotating the data, rather than leaving that control to companies or governments. This inclusive approach is intended to make the dataset more representative of the online speech that actually occurs, and it may help with removing the social media content that does the most harm to marginalized communities. Beyond studying how extreme speech is defined, XTREMESPEECH is used to establish new tasks and benchmarks and to analyze cross-cultural differences.
Provider:
Ludwig Maximilian University of Munich
Created:
2022-03-22



