INEL Selkup Corpus
Source: DataCite Commons. Updated 2024-04-09; indexed 2025-04-16.
Download link:
https://www.fdr.uni-hamburg.de/record/9754
Description:
<strong>Corpus Citation</strong>
<em>Brykina, Maria; Orlova, Svetlana; Wagner-Nagy, Beáta. 2021. “INEL Selkup Corpus.” Version 2.0. Publication date<br>
2021-12-31. https://hdl.handle.net/11022/0000-0007-F4D9-1. Archived at Universität Hamburg. In: The INEL corpora<br>
of indigenous Northern Eurasian languages. https://hdl.handle.net/11022/0000-0007-F45A-1</em>
<strong>Corpus Description</strong>
The INEL Selkup corpus was created within the long-term INEL project (“Grammatical Descriptions, Corpora and Language Technology for Indigenous Northern Eurasian Languages”), 2016–2033. The corpus enables typologically aware, corpus-based grammatical research on the Selkup language and expands the documentation of the lesser-described indigenous languages of Northern Eurasia.
The INEL Selkup corpus is composed of texts from the archive of Angelina Ivanovna Kuzmina (1924–2002), who between 1962 and 1977 gathered a large amount of material on Selkup in almost all regions where the Selkup people lived. The archive was transferred by A.I. Kuzmina to Eugen Helimski and acquired by the Universität Hamburg in 2001. Most texts in the corpus originate from the handwritten part of the archive; the others come from sound recordings made by A.I. Kuzmina, which were transcribed and translated within the INEL project.
<strong>Funding</strong>
The corpus has been produced in the context of the joint research funding of the German Federal Government and Federal States in the Academies’ Programme, with funding from the Federal Ministry of Education and Research and the Free and Hanseatic City of Hamburg. The Academies’ Programme is coordinated by the Union of the German Academies of Sciences and Humanities.
<strong>Contributions/Acknowledgements</strong>
Audio recordings made by Angelina Kuzmina were transcribed and translated by native speakers of Selkup:
Irina Anatolyevna Korobejnikova, written transcription and Russian translation of audio in Central and Southern dialects
Natalya Platonovna Izhenbina, written transcription and Russian translation of audio in Southern dialects
Svetlana Nikitichna Sankevich (Kunina), oral transcription and Russian translation of audio in Northern dialects
Evgeniya Sergeevna Smorgunova (Irikova), oral and written transcription and Russian translation of audio in Northern dialects
Valentina Vladimirovna Tamelkina, oral transcription and Russian translation of audio in Northern dialects
For individual contributions to the collection, transcription and analysis of particular texts, please refer to the user documentation and to the corpus metadata.
The web-based search interface uses the Tsakorpus corpus platform developed by Dr. Timofey Arkhangelskiy, Humboldt Research Fellow at IFUU, Hamburg University.
<strong>New in release 2.0</strong>
The corpus now contains 352 transcripts from 89 speakers, representing the dialects of Taz, Upper Tolka
and Baikha (Northern); Narym and Tym (Central); and Middle Ob, Chaya and Ket (Southern). These contain 14509
sentences and 81498 words in total.
Many texts have been provided with annotations for syntactic functions and semantic roles.
Audio transcriptions, glossing and other annotations have been corrected.
Dialectal attribution of several speakers has been revised.
The remaining non-glossed texts from the Kuzmina archive have also been added to the corpus for completeness. These include 3 texts from the written part of the archive and 40 audio recordings, for 20 of which a preliminary transcription is provided.
Provider:
Universität Hamburg
Created:
2021-12-17



