
NorBench

arXiv · Updated 2023-05-06 · Indexed 2024-06-21
Download link:
https://github.com/ltgoslo/norbench
Overview:
NorBench is a standardized benchmark for evaluating Norwegian language models, created by the Language Technology Group at the University of Oslo. It comprises 10 sub-datasets covering a range of natural language processing (NLP) tasks, from morphosyntactic analysis to machine translation. Each sub-dataset is designed to keep evaluation fair and standardized. Beyond evaluating existing models, NorBench has supported the development of new ones such as NorBERT3 and NorT5, which achieve state-of-the-art performance across multiple tasks. The benchmark targets a range of challenges in Norwegian language processing, including improving model accuracy and reducing gender bias.
Provider:
Language Technology Group, University of Oslo
Created:
2023-05-06