UGSpeechData: A Multilingual Speech Dataset of Ghanaian Languages
Science Data Bank | Updated 2025-10-15 | Indexed 2026-04-23
Download link:
https://www.scidb.cn/detail?dataSetId=bbd6baee3acf43bbbc4fe25e21077c8a
Resource description:
UGSpeechData is a collection of audio speech data in Akan, Ewe, Dagaare, Dagbani, and Ikposo, which are among the most widely spoken languages in Ghana. The uploaded dataset contains a total of 970,148 audio files (5,384.28 hours) and 93,262 transcribed audio files (518 hours). The recordings are spoken descriptions of 1,000 culturally relevant images, collected from indigenous speakers of each language. Each recording is between 15 and 30 seconds long. The dataset is organized into five subfolders, one for each of the five languages. Each language has at least 1,000 hours of speech data and 100 hours of transcribed speech data. Fig. 1 provides details of the transcribed audio corpus, including gender and recording environments for each language.
[Fig. 1. Details of transcribed audio files]
Provider:
University of Ghana
Created:
2025-03-17
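
The totals in the resource description can be cross-checked against the stated clip length of 15 to 30 seconds. The short Python sketch below does only that arithmetic: it divides the stated hours by the stated file counts to get a mean clip duration. The constant and function names are illustrative and are not part of the dataset or any tooling shipped with it.

```python
# Totals copied from the resource description.
TOTAL_FILES = 970_148
TOTAL_HOURS = 5384.28
TRANSCRIBED_FILES = 93_262
TRANSCRIBED_HOURS = 518.0


def mean_clip_seconds(hours: float, files: int) -> float:
    """Return the mean duration per file in seconds."""
    return hours * 3600.0 / files


mean_all = mean_clip_seconds(TOTAL_HOURS, TOTAL_FILES)
mean_transcribed = mean_clip_seconds(TRANSCRIBED_HOURS, TRANSCRIBED_FILES)

print(f"Mean clip length, all audio:         {mean_all:.2f} s")
print(f"Mean clip length, transcribed audio: {mean_transcribed:.2f} s")

# Both means should fall inside the stated 15 to 30 second range.
print("Within stated range:", 15 <= mean_all <= 30 and 15 <= mean_transcribed <= 30)
```

Both means come out close to 20 seconds, which is consistent with the stated per-clip range.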



