sartifyllc/Sukuma-Voices-ACL
Source: Hugging Face | Updated: 2026-02-03 | Indexed: 2026-02-07
Download link:
https://hf-mirror.com/datasets/sartifyllc/Sukuma-Voices-ACL
Description:
Sukuma Voices is the first publicly available speech corpus for Sukuma (Kisukuma). It supports speech-to-text, text-to-speech, and speech evaluation tasks. The dataset contains both human recordings and TTS-synthesized audio, totaling 4,343 samples and 19.56 hours of audio. It is organized into a training set, a test set, and two TTS-generated test sets. Each record carries the audio, its text, the speaker's gender, a voice identifier, a filename, and a record ID. The corpus was built from audio recordings and text transcriptions of the Sukuma New Testament, with rigorous annotation and validation.
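As a rough illustration of the figures above, the following sketch computes the average clip length implied by the stated totals and shows one possible way to fetch the corpus. It assumes the Hugging Face `datasets` library is installed and that the repository can be loaded without extra configuration or access approval, which this listing does not confirm. The helper names `mean_clip_seconds` and `load_corpus` are hypothetical, and the exact split and column names are not given here, so they should be inspected after loading.

```python
REPO_ID = "sartifyllc/Sukuma-Voices-ACL"  # repository name taken from this listing


def mean_clip_seconds(total_hours: float, n_samples: int) -> float:
    """Average clip length in seconds implied by the corpus totals."""
    return total_hours * 3600 / n_samples


def load_corpus():
    """Fetch all splits of the corpus from the Hugging Face Hub.

    Assumption: `pip install datasets` and network access. Split and
    column names are not spelled out in the listing, so print the
    returned object to see what is actually there.
    """
    from datasets import load_dataset  # lazy import: optional dependency

    return load_dataset(REPO_ID)


if __name__ == "__main__":
    # Totals quoted in the description: 19.56 hours across 4,343 samples.
    print(f"{mean_clip_seconds(19.56, 4343):.1f} s per sample on average")
```

The average covers human and TTS samples together, since the listing reports only combined totals. Calling `load_corpus()` and printing the result should list the available splits and their features.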
Provider:
sartifyllc



