five

BanglaEmotion: A Benchmark Dataset for Bangla Textual Emotion Analysis

收藏
Mendeley Data2024-03-27 更新2024-06-26 收录
下载链接:
https://data.mendeley.com/datasets/24xd7w7dhp
下载链接
链接失效反馈
官方服务:
资源简介:
We present a manually annotated Bangla Emotion corpus, which incorporates the diversity of fine-grained emotion expressions in social-media text. We tried to consider more fine-grained emotion labels such as Sadness, Happiness, Disgust, Surprise, Fear and Anger - which are, according to Paul Ekman (1999), the six basic emotion categories. For this task, we collected a large amount of raw text data from the user’s comments on two different Facebook groups (Ekattor TV and Airport Magistrates) and from the public post of a popular blogger and activist Dr. Imran H Sarker. These comments are mostly reactions to ongoing socio-political issues and towards the economic success and failure of Bangladesh. We scrape a total of 32923 comments from the three sources aforementioned above. Out of these, a total of 6314 comments were annotated into the six categories. The distribution of the annotated corpus is as follows: sad = 1341 happy = 1908 disgust = 703 surprise = 562 fear = 384 angry = 1416 We have also provided a balanced set from the above data and split the dataset into training and test set of equal ratio. We considered a proportion of 5:1 for training and evaluation purpose. More information on the dataset and the experiments on it could be found in our paper (related links below).
创建时间:
2024-01-23
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作