CREOLEVAL

Source: arXiv | Updated: 2024-05-06 | Indexed: 2024-06-21
Download link:
https://github.com/hclent/CreoleVal
Resource overview:
The CREOLEVAL dataset was created by researchers from Aalborg University (Denmark) and other institutions to provide Creole language resources for natural language processing (NLP) research. It covers 28 Creole languages and spans several NLP tasks, including reading comprehension, relation classification, and machine translation. Baseline experiments were run with zero-shot transfer learning. The dataset aims to advance research on Creole languages in NLP and computational linguistics and to promote fairness in global language technology. During its creation, the researchers placed particular emphasis on community engagement and drew on recent recommendations from participatory machine learning, so that the resources benefit Creole-speaking communities while also meeting the needs of the NLP community.
Provider:
Aalborg University, Denmark
Created:
2023-10-30