Expanded-SauDiSenti Lexicon + Corpus for Food Delivery Domain
Source: ieee-dataport.org (indexed 2025-03-22)
Download link:
https://ieee-dataport.org/documents/expanded-saudisenti-lexicon-corpus-food-delivery-domain
Description:
Most language used on social media platforms is dialectal, which poses unique challenges for Natural Language Processing. To address this, a large, manually annotated corpus of approximately 30,500 Saudi dialect tweets in the food delivery app domain was introduced. The corpus is annotated with positive, negative, and neutral sentiment categories. Additionally, the existing SauDiSenti lexicon was expanded by 30%, providing an improved resource for sentiment analysis in the Saudi dialect. The corpus and the expanded lexicon have been evaluated using machine learning classifiers. This high-quality, domain-specific dataset and the expanded sentiment lexicon are expected to advance Arabic sentiment analysis, particularly for the Saudi dialect and the food delivery industry.
Provider:
IEEE Dataport



