DGurgurov/georgian_sa

Name: DGurgurov/georgian_sa
Creator: DGurgurov
Published: 2024-05-30 11:08:32
License: No description available

Source: Hugging Face, updated 2024-05-30, indexed 2024-06-12

Download link:

https://hf-mirror.com/datasets/DGurgurov/georgian_sa
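As a minimal sketch of how the data might be fetched programmatically, the snippet below wraps the `load_dataset` function from the Hugging Face `datasets` library. The repository name is taken from this page; the helper name `load_georgian_sa` is hypothetical, and the split and column names are not documented here, so inspect the returned object before relying on them.

```python
# Repository name as listed on this page.
DATASET_ID = "DGurgurov/georgian_sa"


def load_georgian_sa():
    """Download the dataset from the Hugging Face Hub and return it.

    Hypothetical helper: requires `pip install datasets` and network access.
    """
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID)


# Example usage (not executed here because it needs network access):
# dataset = load_georgian_sa()
# print(dataset)  # lists the available splits and column names
```

Printing the returned object is the safest first step, since it shows which splits and fields the repository actually provides.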

Resource overview:

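This dataset contains the sentiment analysis data from Stefanovitch et al. (2022). It is used in a project on improving word embeddings for low-resource languages. The data consists of annotated Georgian text and supports sentiment classification in a 3-label setting (positive, neutral, negative) and a 4-label setting (positive, neutral, negative, mixed). The experiments in the source paper cover both lexicon-based and machine learning-based sentiment classification, exploring logistic regression, support vector machines, and Transformer-based models, as well as transfer learning and translation-based approaches.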
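To make the difference between the 3-label and 4-label settings concrete, here is a minimal sketch of a lexicon-based classifier. It is not the method from Stefanovitch et al. (2022): the toy lexicon uses English placeholder words, and the rule that "mixed" means at least one positive and one negative hit is an assumption made purely for illustration.

```python
# Hypothetical toy polarity lexicon (English placeholders, not from the paper).
TOY_LEXICON = {
    "good": 1, "great": 1, "excellent": 1,
    "bad": -1, "terrible": -1, "awful": -1,
}


def classify(text, lexicon=TOY_LEXICON, four_label=False):
    """Label text by counting positive and negative lexicon hits.

    With four_label=True, text containing both polarities is labeled "mixed"
    (an assumed rule); otherwise the majority polarity wins and ties are neutral.
    """
    tokens = text.lower().split()  # naive whitespace tokenization
    pos = sum(1 for t in tokens if lexicon.get(t, 0) > 0)
    neg = sum(1 for t in tokens if lexicon.get(t, 0) < 0)
    if four_label and pos > 0 and neg > 0:
        return "mixed"
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"
```

For example, `classify("good food but terrible service")` falls back to "neutral" in the 3-label setting because the counts tie, while passing `four_label=True` yields "mixed".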

Provider:

DGurgurov

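Summary of original information

Georgian Sentiment Analysis Dataset

Dataset description: This dataset contains the sentiment analysis data from Stefanovitch et al. (2022).

Data structure: The data is used in a project on improving word embeddings for low-resource languages with graph knowledge.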

Citation (BibTeX):

@inproceedings{stefanovitch-etal-2022-resources,
    title = "Resources and Experiments on Sentiment Classification for {G}eorgian",
    author = "Stefanovitch, Nicolas and Piskorski, Jakub and Kharazi, Sopho",
    editor = "Calzolari, Nicoletta and B{\'e}chet, Fr{\'e}d{\'e}ric and Blache, Philippe and Choukri, Khalid and Cieri, Christopher and Declerck, Thierry and Goggi, Sara and Isahara, Hitoshi and Maegaard, Bente and Mariani, Joseph and Mazo, H{\'e}l{\`e}ne and Odijk, Jan and Piperidis, Stelios",
    booktitle = "Proceedings of the Thirteenth Language Resources and Evaluation Conference",
    month = jun,
    year = "2022",
    address = "Marseille, France",
    publisher = "European Language Resources Association",
    url = "https://aclanthology.org/2022.lrec-1.173",
    pages = "1613--1621",
    abstract = "This paper presents, to the best of our knowledge, the first ever publicly available annotated dataset for sentiment classification and semantic polarity dictionary for Georgian. The characteristics of these resources and the process of their creation are described in detail. The results of various experiments on the performance of both lexicon- and machine learning-based models for Georgian sentiment classification are also reported. Both 3-label (positive, neutral, negative) and 4-label settings (same labels + mixed) are considered. The machine learning models explored include, i.a., logistic regression, SVMs, and transformer-based models. We also explore transfer learning- and translation-based (to a well-supported language) approaches. The obtained results for Georgian are on par with the state-of-the-art results in sentiment classification for well studied languages when using training data of comparable size.",
}
