sartifyllc/Sukuma-Voices

Name: sartifyllc/Sukuma-Voices
Creator: sartifyllc
Published: 2026-02-05 12:30:55
License: No description available

Source: Hugging Face. Updated 2026-02-05. Indexed 2026-02-07.

Download link:

https://hf-mirror.com/datasets/sartifyllc/Sukuma-Voices

Overview:

Sukuma Voices is the first publicly available speech corpus for Sukuma (Kisukuma), a Bantu language spoken by approximately 10 million people in northern Tanzania. The dataset addresses a critical gap in speech technology resources for one of Africa's most severely under-resourced languages. It contains 7,989 samples with a total duration of 23.44 hours and a mean sample duration of 10.56 ± 4.32 seconds. The transcriptions contain 168,022 words in total, of which 24,398 are unique. The dataset supports Automatic Speech Recognition (ASR), Text-to-Speech (TTS), and cross-lingual speech processing. The data comes from audio recordings and text transcriptions of the Sukuma New Testament 2000 translation, which ensures transcription consistency and cultural relevance. The dataset also provides baseline ASR results for the Whisper Large V3 and Wav2Vec2-large-XLSR-53 architectures.
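The corpus figures quoted above can be cross-checked with a little arithmetic. The following is a minimal sketch that uses only the numbers stated in the description; the function name `derived_stats` and the derived quantities (implied hours, words per sample, type-token ratio, words per second) are illustrative choices, not metrics reported by the dataset authors.

```python
# Figures quoted in the dataset description.
NUM_SAMPLES = 7_989
TOTAL_HOURS = 23.44
MEAN_SECONDS = 10.56
TOTAL_WORDS = 168_022
UNIQUE_WORDS = 24_398


def derived_stats():
    """Derive a few sanity-check quantities from the quoted figures.

    These are illustrative derivations, not values published with the dataset.
    """
    # Total duration implied by sample count times mean duration, in hours.
    implied_hours = NUM_SAMPLES * MEAN_SECONDS / 3600
    # Average transcript length in words.
    words_per_sample = TOTAL_WORDS / NUM_SAMPLES
    # Share of word tokens that are distinct word types.
    type_token_ratio = UNIQUE_WORDS / TOTAL_WORDS
    # Average speaking rate over the whole corpus.
    words_per_second = TOTAL_WORDS / (TOTAL_HOURS * 3600)
    return {
        "implied_hours": implied_hours,
        "words_per_sample": words_per_sample,
        "type_token_ratio": type_token_ratio,
        "words_per_second": words_per_second,
    }


if __name__ == "__main__":
    for name, value in derived_stats().items():
        print(f"{name}: {value:.4f}")
```

The implied total duration lands within rounding distance of the stated 23.44 hours, which suggests the sample count and mean duration are mutually consistent. The small gap is expected because the mean duration is itself rounded to two decimals.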

Provided by:

sartifyllc
