STT4SG-350
Source: arXiv | Updated: 2023-05-30 | Indexed: 2024-06-21
Download link:
https://swissnlp.org/datasets/
Description:
STT4SG-350 is a Swiss German speech dataset created by the University of Applied Sciences and Arts Northwestern Switzerland and partner institutions. It contains 343 hours of speech covering all Swiss German dialect regions. The data was collected through a web application in which participants translated Standard German sentences into Swiss German and recorded their translations. The dataset is intended for automatic speech recognition (ASR), text-to-speech synthesis, dialect identification, and speaker recognition. It aims to address the challenges of processing Swiss German automatically, in particular those arising from dialect diversity and the lack of a standardized orthography.
Provided by:
University of Applied Sciences and Arts Northwestern Switzerland
Created:
2023-05-30



