
Hebrew as L4+ Learner Mistakes Corpus - research data

Source: DataCite Commons. Updated: 2026-02-02. Indexed: 2026-05-04.
Download link:
https://uj.rodbuk.pl/citation?persistentId=doi:10.57903/UJ/8EQ26H
Description:
The "Hebrew as L4+ learner mistakes corpus" is a specialized linguistic dataset of 753 documented errors collected over one semester from multilingual students learning Hebrew. It is provided in three formats: the original .xlsx file and two open-access versions (.csv and .ods). The collection is distinctive in focusing on "L4+ learners" - subjects who already know Polish, English, and Arabic and are learning Hebrew as their fourth or subsequent language. The data is organized into a single table that records the mistaken form (in IPA), the target form (in IPA, in Hebrew script, and with Leipzig glossing), English translations, and categorical metadata: the learner's level (Year 1 vs. Year 2), the linguistic skill (speaking/reading), and the error type (phonology, syntax, morphology, lexis, or mixed). The dataset is particularly valuable for researchers studying cross-linguistic influence (CLI) and the acquisition of Semitic languages by multilinguals.
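As a rough illustration of how the single-table layout described above might be explored, the following Python sketch tallies errors by category. The column names (`mistaken_ipa`, `target_ipa`, `translation`, `level`, `skill`, `error_type`) and the sample rows are hypothetical placeholders, not the dataset's documented headers or contents; check the actual header row of the .csv file before adapting it.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for the corpus .csv. The column names and the
# placeholder values are assumptions made for illustration only.
SAMPLE_CSV = """\
mistaken_ipa,target_ipa,translation,level,skill,error_type
placeholder_1,target_1,gloss_1,Year 1,speaking,phonology
placeholder_2,target_2,gloss_2,Year 1,reading,phonology
placeholder_3,target_3,gloss_3,Year 2,speaking,morphology
placeholder_4,target_4,gloss_4,Year 2,speaking,mixed
"""


def load_rows(source):
    """Read a CSV text stream into a list of dicts keyed by column name."""
    return list(csv.DictReader(source))


def tally(rows, *keys):
    """Count rows for each combination of values in the given columns."""
    return Counter(tuple(row[key] for key in keys) for row in rows)


rows = load_rows(io.StringIO(SAMPLE_CSV))
by_type = tally(rows, "error_type")
by_level_and_skill = tally(rows, "level", "skill")

for combination, count in sorted(by_level_and_skill.items()):
    print(combination, count)
```

To run the same tally on the real file, one could pass `open(path, newline="", encoding="utf-8")` to `load_rows`, where `path` points to the downloaded .csv and the keys are replaced with the real column names.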
Provider:
Jagiellonian University in Kraków
Created:
2026-02-01