
Spoken corpus of Udmurt dialects

Source: DataCite Commons. Updated 2025-03-26; indexed 2025-04-16.
Download link:
https://www.fdr.uni-hamburg.de/record/17149
Description:
This deposit contains transcriptions of oral interviews and conversations in various dialects of Udmurt (Permic < Uralic; ISO 639-2 code udm). It contains 25 recordings with transcripts, totaling 93.6 thousand words.

Description of the contents

The contents are as follows:

eaf (directory as ZIP archive): sound files and their transcripts in ELAN
metadata_texts.csv: tab-delimited metadata for the transcriptions
metadata_speakers.csv: tab-delimited metadata for speakers
readme.txt: documentation

Transcriptions

All sound recordings are in WAV format, although some of them were originally recorded in a compressed format (see metadata). Transcriptions are stored in ELAN files. Each ELAN file is linked to one recording. The transcriptions were not thoroughly proofread and may contain mistakes. Please listen to the relevant segments to make sure their transcription is accurate. See readme.txt for further details.

Metadata

The transcript-level metadata are:

filename (without the extension)
code of the collector (TA: Timofey Arkhangelskiy; NA: Nikolai Anisimov; YZ: Iuliia Zubova)
name of the place where the recording was made (in Russian)
original format of the recording (wav/wma/mp3)
genre
date of the recording

The speaker-level metadata are:

code of the speaker
speaker type: native vs. (non-native) linguist
sex (F/M)
year of birth (when known)
variety of Udmurt they represent (usually the settlement where the speaker was born or spent their formative years)

The recordings were transcribed by Tatiana Anisimova and Nikolai Anisimov. Sound alignment was performed by Timofey Arkhangelskiy and Marina Pankova.

References

ELAN (Version 6.9) [Computer software]. (2024). Nijmegen: Max Planck Institute for Psycholinguistics, The Language Archive. Retrieved from https://archive.mpi.nl/tla/elan

Contact

If you have any questions or would like to propose a collaboration, please email Timofey Arkhangelskiy at timarkh@gmail.com.
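The layout described above (ELAN transcripts plus two tab-delimited metadata files) can probably be read with the Python standard library alone. The sketch below is an illustration under stated assumptions, not something shipped with the deposit. The function names are hypothetical. The deposit description does not say whether the metadata files have a header row or how the ELAN tiers are named, so the code returns raw rows and reports whatever tier IDs it finds. It relies only on the general ELAN EAF XML structure (TIME_SLOT, TIER, ALIGNABLE_ANNOTATION, ANNOTATION_VALUE).

```python
import csv
import io
import xml.etree.ElementTree as ET


def read_tab_delimited(stream):
    """Return all rows of a tab-delimited file as lists of strings.

    Despite the .csv extension, the metadata files are described as
    tab-delimited, so the delimiter is set explicitly. No header row
    is assumed here; check readme.txt before treating row 0 as one.
    """
    return [row for row in csv.reader(stream, delimiter="\t")]


def read_eaf_segments(source):
    """Yield (tier_id, start_ms, end_ms, text) for time-aligned annotations.

    `source` is a path or an open file object for one .eaf file.
    Tier names are corpus-specific, so none are hard-coded.
    """
    root = ET.parse(source).getroot()
    # TIME_SLOT elements map slot IDs to millisecond offsets in the recording.
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            text = ann.findtext("ANNOTATION_VALUE", default="")
            yield tier.get("TIER_ID"), start, end, text
```

To use it, open a metadata file with `open(path, encoding="utf-8", newline="")` and pass the file object to `read_tab_delimited`, and pass the path of an unzipped .eaf file to `read_eaf_segments`. The UTF-8 encoding is an assumption. Because the transcriptions were not thoroughly proofread, the start and end offsets are mainly useful for locating the segment in the WAV file and listening to it.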
Provider:
Universität Hamburg
Created:
2025-03-26