soketlabs/CoSHE-Eval

Source: Hugging Face. Updated 2026-01-16; indexed 2026-02-07.
Download link:
https://hf-mirror.com/datasets/soketlabs/CoSHE-Eval
Description:
CoSHE-Eval is an evaluation dataset curated for testing Automatic Speech Recognition (ASR) systems on Hindi-English code-mixed speech. It focuses on bilingual conversational contexts commonly found in India, where Hindi (in Devanagari) and English (in Latin script) co-occur naturally within the same utterance. The dataset contains 1985 samples with a total duration of approximately 30 hours. Audio files are provided in .wav format and are perfectly aligned with their corresponding transcriptions. The dataset also includes detailed technical specifications and usage examples for quantitatively assessing ASR model accuracy.
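The description mentions quantitative assessment of ASR accuracy but does not name a metric. As a minimal sketch, assuming word error rate (WER) is the metric of interest, the function below computes it with a word-level edit distance. The function name `word_error_rate` and the sample sentences are hypothetical illustrations; they are not taken from the dataset or its card.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace, which treats Devanagari
    and Latin-script tokens the same way. No text normalization is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical code-mixed example: the English word is written in Latin
# script in the reference but transliterated to Devanagari in the hypothesis.
reference = "मुझे कल meeting attend करनी है"
hypothesis = "मुझे कल मीटिंग attend करनी है"
print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
```

The example shows why script matters for code-mixed evaluation: a transliterated word counts as a substitution even when the recognized word is phonetically right, so any normalization applied before scoring should be reported alongside the score.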
Provider:
soketlabs