issai/Multilingual_Speech_Dataset
Source: Hugging Face. Updated 2025-02-13; indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/issai/Multilingual_Speech_Dataset
Description:
A multilingual speech dataset for research on end-to-end automatic speech recognition (ASR), covering Kazakh, Russian, and English. It comprises an evaluation set of Kazakh-accented English, an English training set drawn from CommonVoice, and a manually cleaned Russian subset of the OpenSTT dataset. The dataset accompanies a paper comparing monolingual and multilingual ASR approaches, and an academic citation is provided for referencing it.
Provided by:
issai



