GAHD
Source: arXiv | Updated: 2024-03-29 | Indexed: 2024-06-21
Download link:
https://github.com/jagol/gahd
Description:
GAHD is a German adversarial hate speech dataset of approximately 11,000 examples, created at the University of Zurich. It was collected with Dynamic Adversarial Data Collection (DADC), a method that aims to improve dataset quality by exploiting the weaknesses of a model. The dataset covers a wide range of protected groups and controversial topics, with a particular focus on the cultural contexts of Germany, Austria, and Switzerland. While building the dataset, the researchers explored several strategies for supporting annotators in order to make data collection more efficient and more diverse. GAHD is mainly intended for improving the robustness of hate speech detection models: such models can be trained and tested on it to better address online hate speech.
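The idea of exploiting model weaknesses can be illustrated with a minimal sketch. It is not the GAHD authors' pipeline: the `toy_model` keyword classifier, the `Example` type, the `collect_adversarial` helper, and the placeholder sentences are all hypothetical and chosen only for illustration. The sketch assumes that annotators supply a gold label with each example and that an example counts as adversarial when the model's prediction disagrees with that label.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    text: str
    gold_label: int  # assumed convention: 1 = hate speech, 0 = not hate speech


def toy_model(text: str) -> int:
    """Hypothetical stand-in classifier: flags text containing a blocklisted word."""
    blocklist = {"hate"}
    return int(any(word in blocklist for word in text.lower().split()))


def collect_adversarial(
    examples: List[Example], model: Callable[[str], int]
) -> List[Example]:
    """Keep only the examples on which the model disagrees with the gold label."""
    return [ex for ex in examples if model(ex.text) != ex.gold_label]


# Placeholder annotator submissions (illustrative, not taken from GAHD).
candidates = [
    Example("I hate mondays", 0),  # benign, but contains the trigger word
    Example("have a nice day", 0),  # benign, model is correct
    Example("placeholder for an implicitly hateful sentence", 1),  # no trigger word
]

fooling = collect_adversarial(candidates, toy_model)
for ex in fooling:
    print(ex.gold_label, ex.text)
```

In this toy setting the first and third candidates fool the keyword model, one as a false positive and one as a false negative. Examples of this kind are what an adversarial collection round would keep for later training and testing.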
Provider:
University of Zurich
Created:
2024-03-29



