soketlabs/CoSHE-Eval
Source: Hugging Face | Updated: 2026-01-16 | Indexed: 2026-02-07
Download link:
https://hf-mirror.com/datasets/soketlabs/CoSHE-Eval
Description:
CoSHE-Eval is an evaluation dataset for testing Automatic Speech Recognition (ASR) systems on Hindi-English code-mixed speech. It focuses on the bilingual conversational settings common in India, where Hindi (written in Devanagari) and English (written in Latin script) co-occur naturally within the same utterance. The dataset contains 1,985 samples with a total duration of approximately 30 hours. Audio is provided as .wav files, each aligned with its corresponding transcription. The dataset also includes detailed technical specifications and usage examples for quantitatively assessing ASR model accuracy.
Provider:
soketlabs
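
The description mentions quantitatively assessing ASR accuracy but does not name a metric on this page. A common choice for such evaluations is word error rate (WER), so the following is only a minimal sketch under that assumption. The function names and the sample sentences are hypothetical and are not taken from the dataset or its official usage examples; tokens are split on whitespace, which treats Devanagari and Latin-script words alike.

```python
from typing import Iterable, Tuple


def edit_distance(ref: list, hyp: list) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER for a single utterance: edit distance / reference word count."""
    ref = reference.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hypothesis.split()) / len(ref)


def corpus_wer(pairs: Iterable[Tuple[str, str]]) -> float:
    """Corpus-level WER: total edits over total reference words."""
    total_edits = 0
    total_words = 0
    for reference, hypothesis in pairs:
        ref = reference.split()
        total_edits += edit_distance(ref, hypothesis.split())
        total_words += len(ref)
    if total_words == 0:
        raise ValueError("no reference words to score against")
    return total_edits / total_words


# Hypothetical code-mixed example (not from the dataset):
reference = "मैं कल office जाऊंगा"
hypothesis = "मैं कल ऑफिस जाऊंगा"  # model wrote the English word in Devanagari
print(word_error_rate(reference, hypothesis))  # 0.25 (1 substitution / 4 words)
```

The example shows why code-mixed evaluation is sensitive to script: a transliterated word counts as an error under exact matching, so any normalization applied before scoring should be reported alongside the WER figure.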



