Punjabi Speech: A Labeled Speech Corpus

Mendeley Data, indexed 2026-04-09
Download link:
https://data.mendeley.com/datasets/sdbc8f5b77
Resource description:
The Punjabi Speech corpus is designed for automatic speech recognition and speech synthesis. It comprises speech samples recorded in studio and open-environment settings, stored as WAV files with a sampling rate of 44.1 kHz. Each recording is limited to 15 seconds to prevent memory issues when training on GPUs. The dataset currently contains 2429 spoken utterances from two male speakers, totaling approximately 4 hours of data. The data is pre-divided into 80% for training, 10% for validation, and 10% for testing. The layout is straightforward: all speech files are located in the "clips" directory, and the transcript files (train, dev, and test), in TSV format, are located in the parent directory. Each line in a transcript file is the label for a single speech sample in the clips directory. The first column contains the path or name of the corresponding WAV file, and the second column, separated by a tab, contains the transcript as text.
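The two-column transcript layout described above can be read with a few lines of Python. This is a minimal sketch, not part of the dataset's own tooling: the function name `load_transcripts` is hypothetical, and it assumes the TSV files are UTF-8 encoded and have no header row, neither of which the description confirms.

```python
import csv
from pathlib import Path


def load_transcripts(tsv_path, clips_dir):
    """Return a list of (wav_path, transcript) pairs from one transcript file.

    Assumptions (not stated in the dataset description): the file is
    UTF-8 encoded and has no header row.
    """
    pairs = []
    with open(tsv_path, encoding="utf-8", newline="") as handle:
        # QUOTE_NONE keeps any quote characters in a transcript as literal text.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 2:
                continue  # skip blank or malformed lines
            # The first column may be a bare file name or a path; keep only
            # the file name and resolve it against the clips directory.
            wav_name = Path(row[0]).name
            pairs.append((Path(clips_dir) / wav_name, row[1]))
    return pairs
```

Assuming the training transcript is named `train.tsv` (the exact file names are a guess based on "train, dev, and test"), a call such as `load_transcripts("train.tsv", "clips")` would return the training pairs, and the same call works for the dev and test files.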
Provided by:
Satwinder Singh