Voxlab/Synthetic-Spoken-Digit-Dataset
Source: Hugging Face. Updated 2023-07-24; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/Voxlab/Synthetic-Spoken-Digit-Dataset
Resource description:
---
license: mpl-2.0
task_categories:
- conversational
- translation
- audio-classification
- automatic-speech-recognition
- text-to-speech
language:
- en
- es
- fr
- ko
- de
- it
- pt
- ru
- zh
- ja
---
# Synthetic Generated Free Spoken Digit Dataset
*This dataset was generated by [Voxlab](https://voxlab.netlify.app/).*
#### Context
This dataset was generated by text-to-speech (TTS) models. It contains spoken digits from 0 to 9.
It is free to use for research as well as for commercial purposes.
The dataset consists of simple audio files, each a recording of a spoken digit in WAV format.
### Current Status
**Languages**: 10
(English - en, Spanish - es, French - fr, Korean - ko, German - de, Italian - it, Portuguese - pt, Russian - ru, Chinese - zh-cn, Japanese - ja)
**Speakers**: 1 speaker per language
**Number of audio files**: 5,000 (50 for each digit in each language)
**Pronunciations**: the digits are pronounced in all 10 languages
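
The figures above can be cross-checked with a little arithmetic. The sketch below assumes the files are spread evenly across languages and digits, and that "50 for each digit" means 50 per digit within each language. The card does not state this outright, but it is the reading consistent with a total of 5000 files.

```python
# Language codes as listed in the status section above.
languages = ["en", "es", "fr", "ko", "de", "it", "pt", "ru", "zh-cn", "ja"]
digits = range(10)   # spoken digits 0 through 9
total_files = 5000   # total stated in the card

# Assumption: files are evenly distributed over (language, digit) pairs.
files_per_language = total_files // len(languages)
files_per_digit_per_language = files_per_language // len(digits)

print(files_per_language)            # 500
print(files_per_digit_per_language)  # 50
```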
### Dataset File structure
Files are named in the following format: *{digitLabel}-{language}-{speakerGender}-{index}.wav*
*Example: seven-en-F-56.wav*
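
The naming scheme makes it possible to recover labels directly from a file name. The following is a minimal sketch, not part of the dataset's own tooling: `DigitFile` and `parse_filename` are hypothetical names introduced here for illustration. It also assumes that a language code may contain a hyphen (the card lists `zh-cn`), so it splits the label off the left and the last two fields off the right rather than splitting on every hyphen.

```python
from pathlib import Path
from typing import NamedTuple


class DigitFile(NamedTuple):
    """Fields encoded in a file name (hypothetical helper type)."""
    digit_label: str
    language: str
    speaker_gender: str
    index: int


def parse_filename(name: str) -> DigitFile:
    """Parse '{digitLabel}-{language}-{speakerGender}-{index}.wav'."""
    stem = Path(name).stem  # drop any directory part and the ".wav" suffix
    # Peel the digit label off the left, then the gender and index off the
    # right, so a hyphenated language code such as "zh-cn" stays intact.
    digit_label, rest = stem.split("-", 1)
    language, speaker_gender, index = rest.rsplit("-", 2)
    return DigitFile(digit_label, language, speaker_gender, int(index))


example = parse_filename("seven-en-F-56.wav")
print(example.digit_label, example.language, example.speaker_gender, example.index)
```

A file name that does not follow the pattern raises `ValueError`, which is usually preferable to silently mislabelling an audio clip.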
### How to use
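```python
from datasets import load_dataset  # Hugging Face `datasets` package

# Id as given in the card; the download link above uses "Voxlab/Synthetic-Spoken-Digit-Dataset".
dataset = load_dataset("Voxlab/synthetic-generated-free-spoken-digit-dataset")
```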
#### Inspiration
This dataset was inspired by the Free Spoken Digit Dataset (https://www.kaggle.com/datasets/joserzapata/free-spoken-digit-dataset-fsdd).
Explore similar datasets generated by [Voxlab](https://voxlab.netlify.app/):
https://github.com/synthetic-data-platform/Free-Synthetic-Datasets
Provider:
Voxlab