dmnph/common_voice_16_1_hi_pseudo_labelled
Source: Hugging Face. Updated: 2025-03-05. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/dmnph/common_voice_16_1_hi_pseudo_labelled
Description:
The dataset contains audio files and associated metadata. Features include the path to the audio file, the audio data itself, the corresponding text sentence, a flag indicating whether the example depends on the previous condition, and the Whisper-transcribed text. The dataset is split into a training set (230,869 examples), a validation set (3,680 examples), and a test set (3,653 examples).
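The split sizes quoted above can be checked with a few lines of Python. This is a minimal sketch, not an official loader: the repository id is taken from the listing title, the split names `train`, `validation`, and `test` are assumptions (the description names the splits only in prose), and `stream_split` is a hypothetical helper that requires the Hugging Face `datasets` package and network access.

```python
# Split sizes as stated in the dataset description.
SPLIT_SIZES = {"train": 230_869, "validation": 3_680, "test": 3_653}

# Repository id taken from the listing title.
REPO_ID = "dmnph/common_voice_16_1_hi_pseudo_labelled"


def split_fractions(sizes):
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def stream_split(split="train"):
    """Hypothetical helper: stream one split without downloading it fully.

    Assumes the split names on the Hub match the keys of SPLIT_SIZES.
    """
    from datasets import load_dataset  # imported lazily; needs `pip install datasets`

    return load_dataset(REPO_ID, split=split, streaming=True)


if __name__ == "__main__":
    print("total examples:", sum(SPLIT_SIZES.values()))
    for name, fraction in split_fractions(SPLIT_SIZES).items():
        print(f"{name}: {fraction:.2%}")
```

Streaming is used in the sketch because the training split is large. Column names are deliberately not hard-coded, since the description gives the features only in prose.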
Provided by:
dmnph



