Mergen corpus
Source: DataCite Commons | Updated: 2021-01-28 | Indexed: 2025-04-17
Download link:
https://auckland.figshare.com/articles/dataset/Mergen_corpus/13655678/1
Description:
<b>People involved</b><br>This corpus was created by Aidan Winberry in 2020 from a recording of Raisa Alekseevna Beldy reading the text of a Nanai fairy tale, “Mergen ningman”. The recording was made by Vasily Kharitonov.<br><b>Annotation scheme</b><br>The information about Nanai phonemes is taken from (Ko & Yurn, 2011).<br><b>Coding scheme</b><br>The sounds and phonemes are represented by their IPA symbols in Unicode.<br>The text of the fairy tale is provided in Cyrillic Nanai orthography. The Nanai writing system is nearly phonemic, so a Latin transcription layer would simply duplicate the phonemic tier in IPA.<br>In several segments the sound is corrupted by background music; in these segments the annotation at the phonemic and phonetic levels is omitted.<br><b>Annotation quality</b><br>Annotations were made without consulting dictionaries. All phonemes and allophonic variants were marked by ear.<br>Diphthongs are not thoroughly marked.<br>The “Words” tier follows the text of the fairy tale, while the “Phonemes” and “Sounds” tiers represent what was actually said. Several utterances end with an ellipsis, which marks a corrected slip of the tongue.<br>
Provider:
The University of Auckland
Created:
2021-01-28



