HiTZ/benchmark_eseu_testsets
Source: Hugging Face. Updated: 2025-04-19. Indexed: 2025-05-31.
Download link:
https://hf-mirror.com/datasets/HiTZ/benchmark_eseu_testsets
Description:
This is a test set for automatic speech recognition (ASR) covering two languages, Spanish and Basque. It is a reduced version of publicly available datasets, trimmed so that each source contributes roughly the same number of hours, which allows a fair evaluation. It includes small test sets from different sources such as Common Voice, OpenSLR, Multilingual Librispeech, and VoxPopuli, as well as Spanish, Basque, and bilingual test sets from the Basque Parliament. In total it contains 11.89 hours of audio and 5,737 sentences.
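The two totals in the description imply an average utterance length, which is a quick sanity check on what kind of audio the benchmark contains. The sketch below only does that arithmetic; the function name `mean_utterance_seconds` is made up for illustration and is not part of the dataset or of any library.

```python
# Totals as stated in the dataset description (not recomputed from the audio).
TOTAL_HOURS = 11.89
TOTAL_SENTENCES = 5737


def mean_utterance_seconds(total_hours: float, n_sentences: int) -> float:
    """Average audio duration per sentence, in seconds."""
    return total_hours * 3600 / n_sentences


avg_seconds = mean_utterance_seconds(TOTAL_HOURS, TOTAL_SENTENCES)
print(f"{avg_seconds:.2f} s per sentence on average")
```

This comes out to roughly 7.5 seconds per sentence. It is an average over all sources combined; the description does not give per-source figures, so individual test sets may differ.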
Provider:
HiTZ



