Uzbek Speech Corpus (USC)
Source: arXiv. Updated: 2021-07-30. Indexed: 2024-06-21.
Download link:
https://github.com/IS2AI/Uzbek_ASR
Description:
Uzbek Speech Corpus (USC) is an openly available Uzbek speech dataset created by the University of Information Technology and Communications. It contains 105 hours of transcribed audio recorded by 958 speakers of different ages and from different regions. The dataset is intended primarily for automatic speech recognition (ASR), but it can also support other speech tasks such as speech synthesis and speech translation. USC is the first publicly available Uzbek speech dataset and may be used for both academic and commercial purposes. Its goal is to advance the use of Uzbek in speech applications, in particular assistive technologies for people with special needs.
Provider:
University of Information Technology and Communications named after Muhammad al-Khwarizmi
Created:
2021-07-30



