
AsoSoft Speech Corpus

Source: arXiv. Updated: 2021-02-15. Indexed: 2024-08-06.
Download link:
http://arxiv.org/abs/2102.07412v1
Description:
AsoSoft Speech Corpus is the first large-scale speech recognition dataset for Central Kurdish, created by a research team at the University of Tehran. It contains more than 43 hours of speech from 576 speakers of varied backgrounds, recorded in both controlled environments and social network settings. To cover the acoustic variation of Central Kurdish, the sentence set was designed to include every Central Kurdish diphone. The dataset also includes a test set spanning 11 document topics, used to evaluate speech recognition systems across domains. Beyond speech recognition development, the corpus is suitable for other speech processing tasks such as speaker recognition.
Provider:
University of Tehran
Created:
2021-02-15