ayush-shunyalabs/Indic_ASR_Eval
Source: Hugging Face. Updated 2026-04-23. Indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/ayush-shunyalabs/Indic_ASR_Eval
Description:
Indic ASR Eval is a curated evaluation set for Indic-language automatic speech recognition (ASR). It draws 100 samples (seed = 42) from each (source dataset × language) cell of seven public Indic ASR corpora. Each source corpus is published as its own dataset config with a single `test` split, and all audio is at 16 kHz. In total the dataset contains 6,169 rows and approximately 13.3 hours of audio across 16 languages, among them Bengali, Bhojpuri, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi, Sanskrit, Tamil, Telugu, and Urdu. Each config is documented with its row count and notes, for example `kathbath` (1,200 rows; AI4Bharat Kathbath, read speech in 12 languages) and `kathbath_noisy` (1,200 rows; noisy variant of Kathbath). The schema has five fields: audio (16 kHz mono waveform), transcript (reference transcription in native script), language (title-case language name), duration (clip duration in seconds), and dataset (source dataset id matching the config name).
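Because each row carries both a reference transcript and a language label, a per-language word error rate (WER) is a natural way to use this set. The following is a minimal sketch, not the dataset authors' evaluation code. It assumes the column names are literally `transcript` and `language` (inferred from the schema description above), that WER is computed on whitespace-separated tokens without any text normalization, and that model hypotheses arrive as a plain list of strings. The helper names `edit_distance`, `wer_by_language`, and `load_config` are hypothetical; `load_config` relies on the `datasets` library's `load_dataset` and requires network access, so it is defined but not called here.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_language(rows, hypotheses):
    """Aggregate word errors per language, then divide by reference words.

    Assumes each row is a dict with 'transcript' and 'language' keys
    (column names inferred from the schema description, not verified).
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for row, hyp in zip(rows, hypotheses):
        ref_tokens = row["transcript"].split()
        errors[row["language"]] += edit_distance(ref_tokens, hyp.split())
        words[row["language"]] += len(ref_tokens)
    return {lang: errors[lang] / words[lang] for lang in words if words[lang]}


def load_config(config_name="kathbath"):
    """Load one config's `test` split from the Hugging Face Hub (needs network)."""
    from datasets import load_dataset  # imported lazily; optional dependency
    return load_dataset("ayush-shunyalabs/Indic_ASR_Eval", config_name, split="test")


# Toy rows standing in for real dataset rows (placeholder text, not real data).
toy_rows = [
    {"transcript": "a b c", "language": "Hindi"},
    {"transcript": "d e", "language": "Tamil"},
]
toy_hypotheses = ["a x c", "d e"]
toy_scores = wer_by_language(toy_rows, toy_hypotheses)
```

With real data, the rows returned by `load_config` could be passed in place of `toy_rows`. Scores computed this way are sensitive to punctuation and script normalization, so any normalization applied should be reported alongside the numbers.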
Provided by:
ayush-shunyalabs



