RidheshBhati/Complete_Data_Source_100K_HOURS
Source: Hugging Face. Updated: 2026-04-28. Indexed: 2026-04-26.
Download link:
https://hf-mirror.com/datasets/RidheshBhati/Complete_Data_Source_100K_HOURS
Description:
Multi-Language Audio Collection (100K Hours): a multilingual audio dataset organized into per-language configurations. Each configuration stores its data files in Parquet format, and the dataset features are audio and transcript. The dataset is tagged for the audio, speech, and automatic-speech-recognition tasks, and its display name (pretty name) is Complete Data Source (100K Hours). All data folders, including the overflow folders, are consolidated here, and the audio column is set to WAV format via the Hugging Face Audio feature.
Provider:
RidheshBhati
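
Example (a minimal sketch, not an official loading recipe): since the description mentions per-language configurations with audio and transcript features, one plausible way to inspect a few rows is to stream them with the Hugging Face `datasets` library. The split name "train", the helper names `dataset_url` and `stream_examples`, and the use of the `HF_ENDPOINT` environment variable to route requests through the mirror are assumptions made for illustration; the listing itself does not state them.

```python
import os

# Repository ID and mirror host are taken from the listing above.
REPO_ID = "RidheshBhati/Complete_Data_Source_100K_HOURS"
MIRROR_ENDPOINT = "https://hf-mirror.com"


def dataset_url(endpoint=MIRROR_ENDPOINT):
    """Return the dataset page URL on the given endpoint."""
    return f"{endpoint}/datasets/{REPO_ID}"


def stream_examples(config_name, split="train", limit=3, use_mirror=True):
    """Yield up to `limit` rows from one language configuration.

    `config_name` must be one of the dataset's configurations; the
    split name "train" is an assumption and may differ in practice.
    """
    if use_mirror:
        # Assumption: route Hub traffic through the mirror. This must be
        # set before `datasets` / `huggingface_hub` are first imported.
        os.environ.setdefault("HF_ENDPOINT", MIRROR_ENDPOINT)

    # Third-party dependency, imported lazily: pip install datasets
    from datasets import load_dataset

    # Streaming avoids downloading the full 100K-hour collection up front.
    ds = load_dataset(REPO_ID, config_name, split=split, streaming=True)
    for i, row in enumerate(ds):
        if i >= limit:
            break
        yield row
```

To use it, first list the available configurations with `datasets.get_dataset_config_names(REPO_ID)`, pass one of the returned names to `stream_examples`, and check each row's keys before relying on specific column names such as "audio" or "transcript". Decoding the audio column may require an additional audio backend, depending on the installed `datasets` version.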



