Uzbek Speech Corpus (USC)

Source: arXiv. Updated: 2021-07-30. Indexed: 2024-06-21.
Download link:
https://github.com/IS2AI/Uzbek_ASR
Overview:
The Uzbek Speech Corpus (USC) is an open-source Uzbek speech dataset created by the University of Information Technology and Communications. It contains 105 hours of transcribed audio recorded by 958 speakers of different ages and from different regions. The dataset is intended primarily for automatic speech recognition (ASR), and it also supports other speech tasks such as speech synthesis and speech translation. USC is the first publicly available Uzbek speech dataset and may be used for both academic and commercial purposes. Its goal is to promote the development and use of Uzbek in speech applications, in particular assistive technologies for people with special needs.
Provider:
University of Information Technology and Communications named after Muhammad al-Khwarizmi
Created:
2021-07-30