vanh10101/common_voice_16_1_vi_pseudo_labelled

Name: vanh10101/common_voice_16_1_vi_pseudo_labelled
Creator: vanh10101
Published: 2025-04-04 09:51:46
License: not specified

Hugging Face, updated 2025-04-04, indexed 2025-04-12

Download link:

https://hf-mirror.com/datasets/vanh10101/common_voice_16_1_vi_pseudo_labelled

Description:

The dataset contains the following fields: audio file path, audio data (sampled at 16,000 Hz), sentence text, a condition-on-previous field, and the Whisper transcript. It is split into training (388 examples), validation (58 examples), and test (182 examples) sets. The total size is approximately 556 MB.
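As a rough illustration of the split sizes stated above, the following sketch computes each split's share of the total and shows one way the data might be loaded with the Hugging Face `datasets` library. The split names `train`, `validation`, and `test` are assumptions based on the description (the names on the Hub may differ), and `split_fractions` and `load_split` are hypothetical helpers, not part of the dataset.

```python
# Split sizes as stated in the description; the keys are assumed split names.
SPLIT_SIZES = {"train": 388, "validation": 58, "test": 182}

# Repository ID taken from the download link.
REPO_ID = "vanh10101/common_voice_16_1_vi_pseudo_labelled"


def split_fractions(sizes):
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def load_split(split="train"):
    """Load one split from the Hugging Face Hub.

    Needs network access and the `datasets` package. The import is done
    lazily so that split_fractions works without the package installed.
    """
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)


if __name__ == "__main__":
    for name, share in split_fractions(SPLIT_SIZES).items():
        print(f"{name}: {SPLIT_SIZES[name]} examples ({share:.1%})")
```

The three splits total 628 examples, so the training split holds a little under two thirds of the data. Calling `load_split("train")` would download that split; the column names of the returned object should be inspected rather than assumed.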

Provider:

vanh10101
