
Maltese crowS-pairs dataset

Mendeley Data · Updated 2024-06-27 · Indexed 2024-06-28
Download link:
https://drum.um.edu.mt/articles/dataset/Maltese_crowS-pairs_dataset/26056957/1
Description:
Warning: This dataset contains explicit statements of offensive stereotypes which may be upsetting.

The study of bias, fairness and social impact in Natural Language Processing (NLP) lacks resources in languages other than English. Our objective is to support the evaluation of bias in language models in a multilingual setting. We use stereotypes across nine types of biases to build a corpus containing contrasting sentence pairs: one sentence that presents a stereotype concerning an underadvantaged group, and another minimally changed sentence concerning a matching advantaged group.

In total, we produced 11,139 new sentence pairs that cover stereotypes dealing with nine types of biases in seven cultural contexts. We use the final resource for the evaluation of relevant monolingual and multilingual masked language models.

This file contains the sentence pairs localised to the Maltese context in the Maltese language.

Other languages are available here: https://gitlab.inria.fr/corpus4ethics/multilingualcrowspairs

The paper describing this work is available here:
https://www.um.edu.mt/library/oar/handle/123456789/121722
https://aclanthology.org/2024.lrec-main.1545/

To use this dataset, please use the following citation:

Karen Fort, Laura Alonso Alemany, Luciana Benotti, Julien Bezançon, Claudia Borg, Marthese Borg, Yongjian Chen, Fanny Ducel, Yoann Dupont, Guido Ivetta, Zhijian Li, Margot Mieskes, Marco Naguib, Yuyan Qian, Matteo Radaelli, Wolfgang S. Schmeisser-Nieto, Emma Raimundo Schulz, Thiziri Saci, Sarah Saidi, et al. 2024. Your Stereotypical Mileage May Vary: Practical Challenges of Evaluating Biases in Multiple Languages and Cultural Contexts. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 17764–17769, Torino, Italia. ELRA and ICCL.
Created:
2024-06-21