Mburisano Covid-19 multilingual corpus
Source: Figshare | Updated: 2026-03-23 | Indexed: 2026-04-28
Download link:
https://figshare.com/articles/dataset/Mburisano_Covid-19_multilingual_corpus/31833979
Description:
This corpus was created to aid development of the AwezaMed Covid-19 speech-to-speech mobile application. It was produced within the Mburisano project, which was funded by the Department of Sport, Arts and Culture (DSAC). A selection of English sentences was generated in consultation with medical domain experts, and these sentences were manually translated into all official South African languages. The sentences formed the basis for the rapid development of Grammatical Framework (GF) application grammars for all the languages, to aid spoken communication about Covid-19 with a particular focus on screening and triage. The corpus is presented as a limited-domain, manually translated parallel corpus in all 11 official South African languages. The AwezaMed Covid-19 application can be found at: https://play.google.com/store/apps/details?id=za.co.aweza.covid19&gl=ZA. The dataset (corpus) is not housed in this repository; it can be accessed and downloaded via the SADiLaR Language Resource Repository: https://hdl.handle.net/20.500.12185/536.
Created:
2026-03-23



