xhosa

Source: DataCite Commons. Updated 2025-12-12; indexed 2025-04-16.
Download link:
https://spraakbanken.gu.se/resurser/xhosa
Description:
The Corpus of Spoken isiXhosa

The Corpus of Spoken isiXhosa consists of transcribed and annotated recordings of spoken Xhosa [xho]. The recordings have been made in the Eastern Cape, South Africa, from 2015 onwards. The transcribed texts are annotated with morpheme-by-morpheme glosses, part-of-speech tags, and free English translations.

The recordings and annotations of Xhosa data were made as part of three different research projects led by Senior Lecturer Eva-Marie Bloom Ström at the University of Gothenburg. All projects, including the ongoing ‘How do words get in order? The role of speaker-hearer interaction in languages of southern Africa’, were funded by the Swedish Research Council. The corpus has been developed in collaboration with Språkbanken Text.

A user guide and more extensive information about the corpus data can be found in the Corpus of Spoken isiXhosa Manual [PDF].

For more on annotation, preparation of data, and acknowledgements, see: Bloom Ström, E.-M., Slater, O., Zahran, A., Berdicevskis, A., & Schumacher, A. (2023). Preparing a corpus of spoken Xhosa. Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD), 62–67. https://aclanthology.org/2023.clasp-1.7

For questions about the corpus, contact Eva-Marie Bloom Ström (eva-marie.strom@gu.se). If you notice any errors or inconsistencies in annotations, please report them to this email address.

Main contributors:
Eva-Marie Bloom Ström, Senior Lecturer, University of Gothenburg
Onelisa Slater, MA, Rhodes University
Aron Zahran, PhD, Inalco/Llacan (CNRS) & Ghent University

Provider:
Språkbanken Text
Created:
2024-11-26