i4ds/spc_r
Source: Hugging Face. Updated 2025-10-02. Indexed 2025-10-18.
Download link:
https://hf-mirror.com/datasets/i4ds/spc_r
Resource description:
The Swiss Parliaments Corpus Re-Imagined (SPC_R) pairs Swiss German parliamentary speech with Standard German transcriptions, yielding approximately 751 hours of high-quality speech-text data for training and evaluating automatic speech-recognition (ASR) and speech-translation models. The corpus extends the original Swiss Parliaments Corpus by processing full-length sessions with a modern, LLM-enhanced pipeline that boosts transcription accuracy and provides long-form context.
Provided by:
i4ds



