Ugiat/voxlingua107_IberLang
Source: Hugging Face. Updated: 2025-10-29. Indexed: 2025-11-01.
Download link:
https://hf-mirror.com/datasets/Ugiat/voxlingua107_IberLang
Description:
The IberVoice dataset is a curated and validated collection of audio samples in Spanish, Catalan, Galician, Basque, and Occitan, designed to improve spoken language identification (LID) systems for the official languages of Spain. It is derived from the VoxLingua107 multilingual corpus and has been cleaned and re-annotated to correct systematic labeling errors affecting minority languages.
Provider:
Ugiat



