Ainu Speech Corpus
Source: arXiv | Updated: 2020-05-16 | Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/2002.06675v3
Description:
The Ainu Speech Corpus was developed by a research team from the Graduate School of Informatics, Kyoto University, using data provided by the Ainu Museum and the Nibutani Ainu Culture Museum, with the aim of supporting automatic speech recognition (ASR) systems for the Ainu language. This corpus contains approximately 700 hours of audio recordings of Ainu folklore and folk songs, and the folklore portion was selected for constructing ASR models. The development of this corpus involved the categorization and refinement of the audio recordings to facilitate the training of ASR models. The primary application scenarios of this dataset are language preservation and cultural heritage conservation, particularly for the development of speech recognition technologies for the endangered Ainu language.
Provider:
Graduate School of Informatics, Kyoto University
Created:
2020-02-17



