Research on Classification of Kazakh Tourist Questions Combined With Multiple Linguistic Features
Source: Science Data Bank. Updated: 2021-12-09. Indexed: 2026-04-23.
Download link:
https://www.scidb.cn/en/detail?dataSetId=649658e24ffa4d64b18feb04de910ac0
Description:
Kazakh is an agglutinative language with a rich vocabulary, a characteristic that makes data sparse to some extent. In addition, Kazakh tourist questions follow no strict grammatical rules, and keywords in the text have unclear boundaries, nesting, and irregular expressions. A traditional BiGRU cannot extract comprehensive and effective text features, and out-of-vocabulary (OOV) words cannot be fully recognized. To address this, this article proposes, based on the linguistic features of Kazakh tourism text, a question classification model that combines multiple linguistic features with an attention mechanism. Kazakh words and linguistic features serve as input; a BiGRU layer models context information; the attention mechanism focuses on the input features and filters out useless information; and a Softmax layer completes the classification. Because no extensive data set exists in the tourism domain, a set of Kazakh questions was constructed to fill the gap. The study shows that the proposed model successfully combines linguistic and domain features, avoids the sparse-data problem, and performs better on the classification of Kazakh tourism questions.
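The last two stages named in the description, attention over the sequence and a Softmax classifier, can be illustrated with a minimal sketch. This is not the authors' implementation: the hidden states stand in for BiGRU outputs, and the dot-product scoring, the `query` vector, the function names, and all toy numbers are assumptions made only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_pool(hidden_states, query):
    """Weight each time step by its (assumed) dot-product score against
    a query vector, then return the weighted sum and the weights."""
    weights = softmax([dot(h, query) for h in hidden_states])
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[i] for w, h in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return pooled, weights


def classify(pooled, class_weights, class_bias):
    """Linear layer followed by softmax over the question classes."""
    logits = [dot(w, pooled) + b for w, b in zip(class_weights, class_bias)]
    return softmax(logits)


# Toy stand-ins for BiGRU outputs: 3 time steps, hidden size 2 (hypothetical).
hidden_states = [[0.1, 0.3], [2.0, -0.5], [0.0, 0.2]]
query = [1.0, 0.0]                                       # hypothetical attention query
class_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]   # 3 hypothetical classes
class_bias = [0.0, 0.0, 0.0]

pooled, weights = attention_pool(hidden_states, query)
probs = classify(pooled, class_weights, class_bias)
print("attention weights:", weights)
print("class probabilities:", probs)
```

In a trained model the query vector and the classifier parameters would be learned, and the hidden states would come from a BiGRU run over the concatenated word and linguistic-feature inputs; here they are fixed by hand so the weighting step is easy to follow.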
Provided by:
Gulizada HAISA; Gulila ALTENBEK
Created:
2021-12-08



