Afaan Oromoo Text-to-Speech Dataset

NIAID Data Ecosystem, indexed 2026-03-13
Download link:
https://data.mendeley.com/datasets/hnvkvj589y
Description:
Afaan Oromo is one of the most widely spoken languages in the Horn of Africa. Like other Ethiopian languages, it is also under-resourced. The sole purpose of preparing this dataset was to add Afaan Oromo text-to-speech synthesis to our final-year humanoid robot project, so that the robot can speak the Oromo language in addition to using its vision capability to detect human faces, emotion, and gender. Natural language processing applications that use this language are currently in high demand. Furthermore, linguists and researchers who work on these languages are contributing a great deal of data to help the language grow into one that is used internationally and by machines.

The dataset was started by two Electrical Engineering students from Madda Walabu University during their four-month industry internship at iCog-Labs, where they worked on machine learning tasks related to natural language processing. The first phase of the audio was recorded at Addis Ababa University by the two students, who also did the preprocessing. The second phase of recording took place at Madda Walabu University after the internship was completed. For this phase they selected female students from the Electrical Engineering and Afaan Oromo departments to record additional audio, enlarging the corpus available for training the machine learning model. The models used were Tacotron 1 and Tacotron 2 with the WaveGlow vocoder (developed by NVIDIA) and the TensorFlow framework.

Corpus statistics
Total clips: 1,224
Total words: 17,559
Total characters: 116,439
Total duration: 03:11:13
Min clip length: 1 sec
Max clip length: 59 sec
Unique words: 5,040

Credits
The volunteers who took part in the audio recording deserve thanks for their care for their language. They are Obsinet Asmare Motuma, Milko Wariyo Gobana, Roza Hailu Isho, and Demitu Baye Boyosa.
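The published corpus statistics are enough to derive a few rough averages that are useful when sizing a text-to-speech training run. The sketch below only does arithmetic on the figures stated in the description; the function and key names are illustrative and not part of the dataset. It assumes the total duration is given as HH:MM:SS, and it makes no assumption about whether the character count includes spaces.

```python
# Published corpus figures, copied from the dataset description.
TOTAL_CLIPS = 1_224
TOTAL_WORDS = 17_559
TOTAL_CHARACTERS = 116_439
UNIQUE_WORDS = 5_040
TOTAL_DURATION = "03:11:13"  # assumed to be HH:MM:SS


def duration_to_seconds(hhmmss: str) -> int:
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def derived_stats() -> dict:
    """Rough corpus-level averages computed from the published totals."""
    total_seconds = duration_to_seconds(TOTAL_DURATION)
    return {
        "avg_clip_seconds": total_seconds / TOTAL_CLIPS,
        "avg_words_per_clip": TOTAL_WORDS / TOTAL_CLIPS,
        "avg_chars_per_word": TOTAL_CHARACTERS / TOTAL_WORDS,
        "words_per_minute": TOTAL_WORDS / (total_seconds / 60),
        # Share of distinct word forms among all words in the corpus.
        "type_token_ratio": UNIQUE_WORDS / TOTAL_WORDS,
    }


if __name__ == "__main__":
    for name, value in derived_stats().items():
        print(f"{name}: {value:.2f}")
```

These are corpus-wide averages only; per-clip distributions would require the actual audio and transcript files from the download link.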
Created:
2022-04-19