mpanda27/voxpopuli_it_pseudo_labelled

Name: mpanda27/voxpopuli_it_pseudo_labelled
Creator: mpanda27
Published: 2024-11-30 06:09:08
License: No description available

Hugging Face. Updated: 2024-11-30. Indexed: 2024-12-14.

Download link:

https://hf-mirror.com/datasets/mpanda27/voxpopuli_it_pseudo_labelled

Description:

This dataset is a collection of Italian audio and text data, primarily intended for speech recognition tasks. Each sample contains an audio ID, the audio data, normalized text, a condition field (described as a condition sequence, indicating whether the sample depends on the previous sample), and a Whisper transcript. The dataset is divided into training, validation, and test sets containing 12,306, 708, and 688 samples, respectively.
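To put the split sizes in proportion, the sketch below computes each split's share of the total from the counts stated in the description. The split keys (`train`, `validation`, `test`) and the helper names `split_fractions` and `load_split` are assumptions made for illustration. `load_split` additionally assumes the Hugging Face `datasets` package is installed and that the repository exposes splits under those names, which this page does not confirm.

```python
# Split sizes as stated in the description; the key names are assumed.
SPLIT_SIZES = {"train": 12306, "validation": 708, "test": 688}


def split_fractions(sizes):
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def load_split(split="train"):
    """Load one split from the Hugging Face Hub (needs network access).

    Hypothetical helper: assumes the `datasets` package is installed and
    that the split is exposed under the given name.
    """
    from datasets import load_dataset  # imported lazily: optional dependency

    return load_dataset("mpanda27/voxpopuli_it_pseudo_labelled", split=split)


TOTAL_EXAMPLES = sum(SPLIT_SIZES.values())
FRACTIONS = split_fractions(SPLIT_SIZES)

if __name__ == "__main__":
    print(TOTAL_EXAMPLES)
    for name, frac in FRACTIONS.items():
        print(f"{name}: {frac:.1%}")
```

The counts sum to 13,702 examples, so the training set makes up roughly nine tenths of the data, with the remainder divided almost evenly between validation and test.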

Provided by:

mpanda27
