DaLAJ 1.0
Source: arXiv | Updated: 2021-05-14 | Indexed: 2024-06-21
Download link:
https://spraakbanken.gu.se/en/resources/swedishglue
Resource description:
DaLAJ 1.0 is a Swedish dataset for linguistic acceptability judgments, created at the University of Gothenburg and containing 9,596 sentences. It is built on SweLL second-language learner data, drawn from essays written at different proficiency levels. To comply with GDPR, the sentences of the learner essays were shuffled and part of the metadata was removed; information on the learner's native language and course level was retained. DaLAJ 1.0 is intended mainly for natural language understanding tasks, particularly language learning and error detection, and relies on expert judgments to help improve the accuracy of language processing models.
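The anonymization step described above (shuffling sentences and keeping only selected metadata) can be illustrated with a minimal Python sketch. This is not the procedure used by the dataset's authors, which is not documented in this listing. The field names (`essay_id`, `author_age`, `native_language`, `course_level`, `sentences`) and the placeholder records are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical metadata fields to keep; the real SweLL/DaLAJ schema may differ.
KEPT_METADATA = ("native_language", "course_level")


def anonymize(essays, seed=0):
    """Flatten essays into sentence records, drop other metadata, and shuffle."""
    records = []
    for essay in essays:
        for sentence in essay["sentences"]:
            record = {"sentence": sentence}
            # Copy only the metadata fields that are allowed to survive.
            for field in KEPT_METADATA:
                record[field] = essay[field]
            records.append(record)
    # Shuffling breaks the link between a sentence and its source essay.
    random.Random(seed).shuffle(records)
    return records


# Placeholder input, not real learner data.
essays = [
    {"essay_id": "e1", "author_age": 31, "native_language": "L1-A",
     "course_level": "B", "sentences": ["sentence A1", "sentence A2"]},
    {"essay_id": "e2", "author_age": 24, "native_language": "L1-B",
     "course_level": "C", "sentences": ["sentence B1", "sentence B2"]},
]

shuffled = anonymize(essays)
```

Each resulting record carries only the sentence, the native language, and the course level, so essay identifiers and other personal metadata do not appear in the output.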
Provider:
University of Gothenburg
Created:
2021-05-14



