ebellob/voxpopuli_spanish_enhanced
Source: Hugging Face. Updated: 2026-04-23. Indexed: 2026-04-26.
Download link:
https://hf-mirror.com/datasets/ebellob/voxpopuli_spanish_enhanced
Description:
This dataset is a processed and enhanced version of the Spanish subset of the facebook/voxpopuli dataset. It is designed to support high-quality speech research, including Text-to-Speech (TTS), Automatic Speech Recognition (ASR), and speech enhancement and robustness studies. To improve audio quality, the recordings were denoised with CleanUNet and then passed through FlashSR for audio super-resolution. The dataset contains 50,922 samples and is approximately 105 GB in size. All audio is in Castilian (Peninsular) Spanish and comes from recordings of European Parliament plenary sessions.
Provider:
ebellob
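
Since the full dataset is about 105 GB, it may be convenient to inspect a few samples before downloading everything. The following is a minimal sketch, not an official loading recipe: it assumes the Hugging Face `datasets` library is installed, that the repository exposes a split named "train" (an assumption, not confirmed by the description above), and that 1 GB means 10^9 bytes. The helper names `approx_mb_per_sample` and `peek` are hypothetical and introduced only for illustration.

```python
import itertools

# Repository ID and figures taken from the description above.
REPO_ID = "ebellob/voxpopuli_spanish_enhanced"
NUM_SAMPLES = 50_922
APPROX_SIZE_GB = 105  # approximate, as stated in the description


def approx_mb_per_sample(size_gb=APPROX_SIZE_GB, num_samples=NUM_SAMPLES):
    """Rough average size per sample in MB, assuming 1 GB = 1000 MB."""
    return size_gb * 1000 / num_samples


def peek(n=1, split="train"):
    """Stream the first n samples without downloading the whole dataset.

    The split name "train" is an assumption; adjust it if the repository
    uses different split names. Decoding audio columns may require extra
    audio packages depending on the installed `datasets` version.
    To go through the mirror linked above, the HF_ENDPOINT environment
    variable can be set to https://hf-mirror.com before importing `datasets`.
    """
    from datasets import load_dataset  # imported lazily; needs network access

    stream = load_dataset(REPO_ID, split=split, streaming=True)
    return list(itertools.islice(stream, n))
```

Calling `approx_mb_per_sample()` gives roughly 2 MB per sample under these assumptions. Calling `peek(1)` returns a list with one sample dictionary, whose keys show the column names actually present in the repository.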



