five

UGSpeechData: A Multilingual Speech Dataset of Ghanaian Languages

收藏
科学数据银行2025-10-15 更新2026-04-23 收录
下载链接:
https://www.scidb.cn/detail?dataSetId=bbd6baee3acf43bbbc4fe25e21077c8a
下载链接
链接失效反馈
官方服务:
资源简介:
The UGSpeechData is a collection of audio speech data of Akan, Ewe, Dagaare, Dagbani, and Ikposo. These languages are among the most spoken languages in Ghana. The uploaded dataset contains a total of 970148 audio files (5384.28 hours) and 93262 transcribed audio files (518 hours). The audio files are descriptions of 1000 culturally relevant images collected from indigenous speakers of each of the languages. Each audio is between 15 to 30 seconds long. More specifically, the dataset contains five subfolders for each of the five respective languages. Each language has at least 1000 hours of speech data and 100 hours of transcribed speech data. Fig. 1 provides details of the transcribed audio corpus, including gender and recording environments for each language.Fig. 1. Details of transcribed audio files
提供机构:
University of Ghana
创建时间:
2025-03-17
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作