Kurdish (Sorani) POS-tagged lexicon
Source: arXiv. Updated 2022-01-30. Indexed 2024-06-21.
Download link:
https://kurdishblark.github.io/
Resource description:
This dataset was developed by Hossein Hassani of the University of Kurdistan. It is a part-of-speech-tagged lexicon for Kurdish (Sorani), derived from the Persian Bijankhan corpus. The lexicon contains 13,294 entries, produced by a combination of machine translation and manual evaluation. It is intended for natural language processing, particularly part-of-speech tagging, and aims to ease the scarcity of Kurdish language resources and support the development of Kurdish in computational linguistics.
Provider:
University of Kurdistan
Created:
2022-01-30



