
KenTrans: A Parallel Corpora for Swahili and local Kenyan Languages

Source: DataONE. Updated: 2022-10-19. Indexed: 2024-06-08.
Download link:
https://search.dataone.org/view/sha256:cd58f9e50febeac0a70d1e47e23b8b6a6237660b50d2c2c5a1515397bd8b20bb
Description:
This project produced a parallel corpus between Swahili and two other Kenyan languages: Dholuo and Luhya. The Luhya language has several dialects, and three were chosen as a start: Lumarachi, Logooli and Lubukusi. A total of 12,400 sentences were translated into Kiswahili from a sample of Dholuo and Luhya texts (1,500 Dholuo-Kiswahili sentence pairs and 10,900 Luhya-Kiswahili sentence pairs). Each document contains sentence pairs: the sentence in the original language starts with the letter “O” followed by a colon (“O:”), and the translated Kiswahili sentence below it starts with the letter “T” followed by a colon (“T:”).

Acknowledgement of translators:
Luo - Swahili: Mercy Lavinca Oduoll (Coordinator), Bildad Okebe, Immaculate Ochieng, Mary Muma
Luhyia (Logooli) - Swahili: Phillip Lumwamu (Coordinator), Kints Mugoha Musungu, Vivian Alivitsa, Joseph Ambwere, Joyline Ingasiani
Luhyia (Bukusu) - Swahili: Martin Barasa Mulwale (Coordinator), Samwel Ralph Nyongesa, Tobias Shikuku, Phelisters N Simiyu
Luhyia (Marachi) - Swahili: Judith Awinja (Coordinator), Evans Owino, Belinda Oduor, Frankline Mwaro
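
The “O:” / “T:” layout described above can be read with a few lines of Python. This is a minimal sketch, not the project's own tooling: it assumes each sentence sits on a single line of plain text, that the prefixes are exactly `O:` and `T:` with an ASCII colon, and that every translation directly follows its original. The function name `parse_pairs` and the placeholder sentences are hypothetical, chosen only for illustration.

```python
from typing import Iterable, List, Tuple


def parse_pairs(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Collect (original, translation) pairs from 'O:' / 'T:' prefixed lines.

    Assumption: one sentence per line; a 'T:' line belongs to the most
    recent unmatched 'O:' line. Lines with neither prefix are ignored.
    """
    pairs: List[Tuple[str, str]] = []
    source = None  # the pending original-language sentence, if any
    for raw in lines:
        line = raw.strip()
        if line.startswith("O:"):
            source = line[2:].strip()
        elif line.startswith("T:") and source is not None:
            pairs.append((source, line[2:].strip()))
            source = None  # pair completed; wait for the next 'O:' line
    return pairs


# Placeholder text only; these are not sentences from the corpus.
sample = [
    "O: <original sentence 1>",
    "T: <Kiswahili translation 1>",
    "",
    "O: <original sentence 2>",
    "T: <Kiswahili translation 2>",
]
pairs = parse_pairs(sample)
```

A "T:" line with no preceding "O:" line is skipped rather than raising an error, which keeps the sketch tolerant of stray lines. If the actual files wrap sentences across lines or use a different colon character, the prefix check would need adjusting.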
Created:
2023-11-08