CSIR SAMA Speech Corpus Manual Datasets
Source: DataCite Commons. Updated: 2026-03-23. Indexed: 2026-04-25.
Download link:
https://dataspace.csir.co.za/articles/dataset/CSIR_SAMA_Speech_Corpus_Manual_Datasets/31832659/2
Description:
The evaluation corpus contains orthographically transcribed broadband speech in Afrikaans, isiXhosa, isiZulu, Sepedi, Sesotho, and Tshivenḓa, all of which are among South Africa’s eleven official written languages. The audio was harvested as MP3 podcasts, then automatically segmented and transcribed. Segment transcriptions are provided in XML format. The data is housed in the SADiLaR Language Resource Repository and can be accessed and downloaded via https://hdl.handle.net/20.500.12185/689
Provider:
Council for Scientific and Industrial Research (CSIR)
Created:
2026-03-23



