St. Lawrence Island Yupik Digital Corpus
Source: arXiv. Updated: 2021-01-26. Indexed: 2024-06-21.
Download link:
https://github.com/SaintLawrenceIslandYupik/digital_corpus
Description:
The St. Lawrence Island Yupik Digital Corpus was created by the Department of Linguistics at the University of Illinois Urbana-Champaign to digitize and preserve written texts in the St. Lawrence Island Yupik language. The dataset contains 90 Yupik texts, ranging from elementary readers to religious texts. The research team produced it using scanning, image processing, and optical character recognition (OCR). Intended uses include linguistic research, natural language processing, and Yupik language education and revitalization, addressing both the endangered status of the language and the shortage of educational resources.
Provider:
Department of Linguistics, University of Illinois Urbana-Champaign
Created:
2021-01-26



