MAB-Swedish

arXiv · Updated 2023-09-16 · Indexed 2024-06-21

Download link:

https://github.com/LTU-Machine-Learning/bipolswedish.git

Description:

MAB-Swedish is a large multi-axis bias-annotated dataset created by the Machine Learning Group at Luleå University of Technology. It contains 1.94 million samples and is intended for evaluating and interpreting bias in Swedish text. The dataset was produced by machine-translating its English counterpart and covers several bias categories, including gender and racial bias. Quality control was applied during creation to check the accuracy of the translations. Its main application area is natural language processing, where it is aimed at addressing bias in model training and improving model fairness and accuracy.

Provided by:

Machine Learning Group, EISLAB, Digital Services and Systems, Luleå University of Technology, Sweden

Created:

2023-01-28
