Qugu Qiang texts and wordlists for NLP

Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/record/6305611
Description:
This is a collection of Qugu Qiang (Glottolog nort2722, ISO cng) vocabulary and texts for use in NLP. All materials are presented in the Qiang orthography, Rrmea Lehhrr.

The wordlist forms are from Zhou's 2010 dictionary of Qugu. Including example sentences, the dictionary contains approximately 26,000 entries. Of these, approximately 360 forms were judged to contain typographical errors; they have been removed from the file 'Zhou_2010_cleaned.csv'. The full forms exactly as printed in Zhou 2010 are given in the file 'Zhou_2010_original.csv'. The text 'Zhou_2010_text' is a short introduction to the dictionary project written in Rrmea Lehhrr.

The collection also includes a set of traditional texts in Rrmea Lehhrr. The texts 'Huang_&_Zhou_2006' come from the annotated texts in Huang & Zhou's 2006 descriptive grammar of Qugu; they were converted from the International Phonetic Alphabet into the Qiang orthography. Additional texts come from the Chinese Qiang History and Culture volumes published in 2021, which contain some text in Rrmea Lehhrr; these are listed as 'Qiang_2021_text.txt'. Sentences with greetings were taken from Huang, Zhou & Zhang 2014, a publication called '366 daily Qiang sentences'. Lastly, some sentences were taken from the Baidu Baike page on Rrmea Lehhrr: https://baike.baidu.com/item/%E7%BE%8C%E6%96%87/830740
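
Because the description names both an original and a cleaned wordlist, one natural first step is to list which rows were dropped during cleaning. The following is a minimal Python sketch, not the dataset author's procedure. It assumes both files are plain CSV and that cleaning removed whole rows rather than editing them in place; neither assumption is confirmed by the description. The function name `removed_rows` and the demo rows are hypothetical placeholders, not real Qugu Qiang data.

```python
import csv
import io
from collections import Counter


def removed_rows(original_csv, cleaned_csv):
    """Return rows of the original that are missing from the cleaned file.

    Both arguments are open text streams containing CSV data. Rows are
    compared as whole tuples (a multiset difference), so no column layout
    is assumed.
    """
    original = [tuple(row) for row in csv.reader(original_csv)]
    remaining = Counter(tuple(row) for row in csv.reader(cleaned_csv))
    removed = []
    for row in original:
        if remaining[row] > 0:
            # This copy of the row survived cleaning; consume one match.
            remaining[row] -= 1
        else:
            removed.append(row)
    return removed


# Demo on made-up placeholder rows (hypothetical, not from the dataset).
original_demo = io.StringIO("form,gloss\naaa,one\nbbb,two\nccc,three\n")
cleaned_demo = io.StringIO("form,gloss\naaa,one\nccc,three\n")
demo_removed = removed_rows(original_demo, cleaned_demo)
print(demo_removed)
```

With the real files, the two streams could be opened with something like `open('Zhou_2010_original.csv', encoding='utf-8', newline='')`, assuming the files are UTF-8 encoded. If the cleaned file corrected entries instead of dropping them, the edited rows would also show up in the result.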
Created:
2022-03-01