dmnph/common_voice_16_1_hi_pseudo_labelled

Name: dmnph/common_voice_16_1_hi_pseudo_labelled
Creator: dmnph
Published: 2025-03-05 13:54:27
License: No description available

Source: Hugging Face. Updated 2025-03-05; indexed 2025-04-12.

Download link:

https://hf-mirror.com/datasets/dmnph/common_voice_16_1_hi_pseudo_labelled

Description:

The dataset consists of audio files and associated information. Its features are the path to the audio file, the audio data itself, the corresponding text sentence, a flag indicating whether the example depends on the previous condition, and the Whisper-transcribed text. It is split into a training set of 230,869 examples, a validation set of 3,680 examples, and a test set of 3,653 examples.
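As a rough illustration of the split sizes above, the following Python sketch computes each split's share of the total and shows one possible way to peek at a single example with the Hugging Face `datasets` library. The split names `train`, `validation`, and `test` are assumptions based on the description, not confirmed by this page, and the helper names `split_fractions` and `peek_first_example` are hypothetical.

```python
# Split sizes as stated in the description above.
SPLIT_SIZES = {"train": 230_869, "validation": 3_680, "test": 3_653}


def split_fractions(sizes):
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def peek_first_example(split="train"):
    """Stream one example without downloading the whole split.

    Assumption: the Hub split names are "train", "validation", "test".
    Requires the third-party `datasets` package and network access.
    """
    from datasets import load_dataset  # imported lazily; optional dependency

    ds = load_dataset(
        "dmnph/common_voice_16_1_hi_pseudo_labelled",
        split=split,
        streaming=True,
    )
    return next(iter(ds))


if __name__ == "__main__":
    total = sum(SPLIT_SIZES.values())
    print(f"total examples: {total}")
    for name, frac in split_fractions(SPLIT_SIZES).items():
        print(f"{name}: {frac:.2%}")
```

Calling `peek_first_example()` and inspecting the keys of the returned dictionary is one way to check the actual feature names, which this page describes only in prose. Decoding the audio field may need additional audio packages installed.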

Provider:

dmnph
