Slovene SuperGLUE Benchmark

Source: arXiv. Updated 2022-02-10; indexed 2024-08-06.
Download link:
http://arxiv.org/abs/2202.04994v1
Description:
Slovene SuperGLUE Benchmark is a Slovenian-language dataset produced by a combination of machine and human translation, designed to evaluate natural language processing models on Slovenian. It contains approximately 120,000 words of translated content and covers multiple task types, including question answering and natural language inference. To handle Slovenian's morphological and grammatical differences from the original English SuperGLUE data, the dataset was built by combining machine translation with manual proofreading. It is used mainly to evaluate and compare language models on Slovenian in monolingual, cross-lingual, and multilingual settings, with the aim of advancing natural language processing research on low-resource languages.
Provided by:
University of Ljubljana, Faculty of Computer and Information Science
Created:
2022-02-10