Hebrew as L4+ Learner Mistakes Corpus - research data

Name: Hebrew as L4+ Learner Mistakes Corpus - research data
Creator: Jagiellonian University in Kraków
Published: 2026-02-02 08:29:52
License: No description available

Source: DataCite Commons. Updated: 2026-02-02. Indexed: 2026-05-04.

Download link:

https://uj.rodbuk.pl/citation?persistentId=doi:10.57903/UJ/8EQ26H


Description:

The "Hebrew as L4+ learner mistakes corpus" is a specialized linguistic dataset containing 753 documented errors collected from multilingual students learning Hebrew over one semester. The dataset is provided in three formats: the original .xlsx file and two open-format versions (.csv and .ods). The collection is unusual in that it focuses on "L4+ learners": subjects who already know Polish, English, and Arabic and are learning Hebrew as their fourth or subsequent language. The data is organized into a single table documenting the mistaken form (in IPA), the target form (in IPA, in Hebrew script, and with Leipzig glossing), English translations, and categorical metadata: the learner's level (Year 1 vs. Year 2), the linguistic skill involved (speaking or reading), and the error type (phonology, syntax, morphology, lexis, or mixed). The dataset is particularly valuable for researchers studying cross-linguistic influence (CLI) and the acquisition of Semitic languages by multilinguals.
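Because the corpus is a single table with categorical columns, a first look at it can be a simple tabulation of errors by level, skill, and type. The following is a minimal sketch using only the Python standard library. The column names (`level`, `skill`, `error_type`) are assumptions, not taken from the dataset, and the sample rows are invented placeholders rather than corpus data; check the header row of the actual .csv file before use.

```python
import csv
import io
from collections import Counter

# Hypothetical column names; the real header row may differ.
LEVEL_COL = "level"
SKILL_COL = "skill"
ERROR_TYPE_COL = "error_type"


def load_rows(handle):
    """Read an open CSV file object into a list of dicts keyed by header name."""
    return list(csv.DictReader(handle))


def count_by(rows, *columns):
    """Count rows by the combination of values found in the given columns."""
    return Counter(tuple(row[c] for c in columns) for row in rows)


# Invented placeholder rows standing in for the real file (not corpus data).
SAMPLE = """level,skill,error_type
Year 1,speaking,phonology
Year 1,reading,phonology
Year 2,speaking,syntax
Year 1,speaking,phonology
"""

rows = load_rows(io.StringIO(SAMPLE))
by_type = count_by(rows, ERROR_TYPE_COL)
by_level_and_type = count_by(rows, LEVEL_COL, ERROR_TYPE_COL)

for key, n in by_level_and_type.most_common():
    print(key, n)
```

With the downloaded file, the same functions could be applied by opening it with `open(path, encoding="utf-8", newline="")` and passing the handle to `load_rows`; UTF-8 is an assumption here, chosen because the table contains IPA and Hebrew script.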

Provider:

Jagiellonian University in Kraków

Created:

2026-02-01
