Hebrew as L4+ Learner Mistakes Corpus - research data
Source: DataCite Commons. Updated: 2026-02-02. Indexed: 2026-05-04.
Download link:
https://uj.rodbuk.pl/citation?persistentId=doi:10.57903/UJ/8EQ26H
Resource description:
The "Hebrew as L4+ learner mistakes corpus" is a specialized linguistic dataset of 753 documented errors collected from multilingual students learning Hebrew over the course of one semester. The dataset is provided in three formats: the original .xlsx file and two open-access versions (.csv and .ods).
The collection is distinctive in its focus on "L4+ learners": subjects who already know Polish, English, and Arabic and are learning Hebrew as their fourth or subsequent language. The data are organized into a single table that records the mistaken form (in IPA), the target form (in IPA, in Hebrew script, and with Leipzig glossing), English translations, and categorical metadata: the learner's level (Year 1 or Year 2), the linguistic skill (speaking or reading), and the error type (phonology, syntax, morphology, lexis, or mixed). The dataset is particularly valuable for researchers studying cross-linguistic influence (CLI) and the acquisition of Semitic languages by multilinguals.
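Because the data come as a single table with categorical columns, a first look at the .csv version could be a simple tally of errors by learner level and error type. The following is a minimal sketch only: the column names `level` and `error_type` are hypothetical placeholders, since the actual headers are not listed in this description and should be checked against the downloaded file.

```python
import csv
from collections import Counter

# Hypothetical column names: the real headers in the published table may differ.
LEVEL_COLUMN = "level"
ERROR_TYPE_COLUMN = "error_type"


def count_errors(csv_file, level_column=LEVEL_COLUMN, error_type_column=ERROR_TYPE_COLUMN):
    """Count rows per (learner level, error type) pair in an open CSV file."""
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        # Strip stray whitespace so "Year 1 " and "Year 1" are counted together.
        key = (row[level_column].strip(), row[error_type_column].strip())
        counts[key] += 1
    return counts
```

To use it, open the downloaded file with `encoding="utf-8"` and `newline=""` (the table contains IPA and Hebrew script), pass the file object to `count_errors`, and override the two column arguments with the headers actually present in the file.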
Provider:
Jagiellonian University in Kraków
Created:
2026-02-01



