The LnNor Corpus: A spoken multilingual corpus of non-native and native Norwegian, English and Polish – Part 2
Source: DataCite Commons. Updated: 2024-04-26. Indexed: 2024-07-13.
Download link:
https://researchportal.amu.edu.pl/info/researchdata/UAM22e4ff296ea8497fbe64a214a78c3220/
Description:
The LnNor corpus was compiled from data collected in two projects: CLIMAD (Cross-linguistic influence in multilingualism across domains: phonology and syntax) and ADIM (Across-domain Investigations in Multilingualism: Modeling L3 Acquisition in Diverse Settings), led by Prof. Magdalena Wrembel at Adam Mickiewicz University in Poznań, Poland, and by Prof. Marit Westergaard at the Arctic University of Norway. The corpus was created between December 2021 and April 2024 with funding from the National Science Centre (NCN) in Poland and Norway Grants. Data collection covered a broad range of speech elicitation tasks: the recordings consist of word, sentence and text reading, picture story description, video story retelling, spontaneous speech and socio-phonetic interviews in Polish, English and Norwegian. The corpus includes speaker metadata based on the Language History Questionnaire (Li et al. 2020), such as age, gender, native languages, proficiency level, length of language exposure and age of onset. The LnNor corpus was designed to represent multilingual speech, with a focus on L3/Ln learners of Norwegian as well as native controls for Norwegian, English and Polish. The LnNor corpus part 1 consists of 1073 annotated files from 78 speakers: 53 L1 Polish speakers, 16 L1 Norwegian speakers and 9 L1 speakers of other European languages. The total recording time is approximately 35 hours and the full size is 18 GB. The recordings in the released LnNor corpus part 1 cover data collected between 2021 and 2022.
Provider:
OMEGA-PSIR
Created:
2024-04-26



