Google Speech Commands (SC09)

Name: Google Speech Commands (SC09)
Creator: Google
License: Not specified

Source: arXiv. Indexed on 2025-09-30.

Download link:

https://github.com/RF5/simple-speech-commands

Resource overview:

This dataset contains isolated spoken words, specifically the ten digits from "zero" to "nine". The utterances come from different speakers recorded under varying channel conditions, which makes the dataset a challenging benchmark for unconditional speech synthesis. It is used primarily to train and validate the models described in the paper, with a focus on measuring the quality and diversity of the generated utterances. Each utterance lasts approximately one second and is sampled at 16 kHz.
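The format described above (ten digit words, clips of roughly one second at 16 kHz) implies about 16,000 samples per clip. The following is a minimal sketch of how clips might be brought to a fixed length before training, under that assumption. The names `SC09_WORDS`, `WORD_TO_INDEX`, and `pad_or_trim` are hypothetical and are not taken from the dataset or the linked repository.

```python
# Illustrative constants derived from the description; helper names are hypothetical.
SAMPLE_RATE = 16_000   # 16 kHz sampling rate
CLIP_SECONDS = 1.0     # clips last approximately one second
TARGET_LEN = int(SAMPLE_RATE * CLIP_SECONDS)

# The ten digit words, "zero" through "nine".
SC09_WORDS = ["zero", "one", "two", "three", "four",
              "five", "six", "seven", "eight", "nine"]
WORD_TO_INDEX = {word: index for index, word in enumerate(SC09_WORDS)}


def pad_or_trim(samples, target_len=TARGET_LEN):
    """Return a list of exactly target_len samples.

    Clips shorter than target_len are right-padded with zeros (silence);
    longer clips are truncated at target_len.
    """
    samples = list(samples)
    if len(samples) >= target_len:
        return samples[:target_len]
    return samples + [0.0] * (target_len - len(samples))
```

Because the clips are only approximately one second long, a fixed-length step of this kind is a common preprocessing choice; right-padding with silence is one option among several and is assumed here for simplicity.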

Provided by:

Google

Background overview

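Google Speech Commands (SC09) is an audio dataset for speech classification, made up of 1-second speech clips sampled at 16 kHz. The full dataset covers 35 words; the SC09 subset covers the 10 digits. It is commonly used to train and evaluate deep learning models, such as pretrained PyTorch classifiers that reach high test-set accuracy (for example, 96.1%-98.1% on the SC09 subset), and it comes with convenient tools for deploying and training such models.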

The content above was collected and summarized by 遇见数据集.
