Google Speech Commands (SC09)
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/RF5/simple-speech-commands
Description:
This dataset contains isolated spoken words: the ten digits "zero" through "nine". The utterances come from many different speakers recorded under varying channel conditions, which makes the dataset a challenging benchmark for unconditional speech synthesis. It is used primarily to train and validate the models described in the paper, with a focus on measuring the quality and diversity of generated utterances. Each utterance lasts approximately one second and is sampled at 16 kHz.
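Because the clips are only approximately one second long, a training pipeline usually needs them at a fixed length. The following is a minimal sketch, not the dataset's official loader: it assumes the clips are stored as mono 16-bit PCM WAV files (the description does not state the file format), and the names `fix_length` and `load_clip` are hypothetical helpers introduced for illustration.

```python
import array
import sys
import wave

SAMPLE_RATE = 16_000      # 16 kHz, as stated in the description
TARGET_LEN = SAMPLE_RATE  # one second of audio (assumed fixed target length)


def fix_length(samples, target_len=TARGET_LEN):
    """Truncate or zero-pad a sequence of samples to exactly target_len."""
    samples = list(samples[:target_len])
    return samples + [0] * (target_len - len(samples))


def load_clip(path):
    """Read a mono 16-bit PCM WAV file (assumed format) as fixed-length samples."""
    with wave.open(path, "rb") as wf:
        if wf.getframerate() != SAMPLE_RATE:
            raise ValueError(f"expected {SAMPLE_RATE} Hz, got {wf.getframerate()} Hz")
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM audio")
        pcm = array.array("h")
        pcm.frombytes(wf.readframes(wf.getnframes()))
    # WAV data is little-endian; swap bytes on big-endian hosts.
    if sys.byteorder == "big":
        pcm.byteswap()
    return fix_length(pcm)
```

Calling `load_clip` on a clip path would return a list of exactly 16,000 integer samples, with shorter clips zero-padded at the end and longer ones truncated.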
Provider:
Google



