BanglaEmotion: A Benchmark Dataset for Bangla Textual Emotion Analysis
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://data.mendeley.com/datasets/24xd7w7dhp
Description:
We present a manually annotated Bangla emotion corpus that captures the diversity of fine-grained emotion expressions in social-media text. We use six emotion labels: Sadness, Happiness, Disgust, Surprise, Fear, and Anger, which according to Paul Ekman (1999) are the six basic emotion categories. For this task, we collected a large amount of raw text from user comments on two Facebook groups (Ekattor TV and Airport Magistrates) and from the public posts of a popular blogger and activist, Dr. Imran H Sarker. These comments are mostly reactions to ongoing socio-political issues and to the economic successes and failures of Bangladesh. We scraped a total of 32923 comments from these three sources. Of these, 6314 comments were annotated with one of the six categories. The distribution of the annotated corpus is as follows:
sad = 1341
happy = 1908
disgust = 703
surprise = 562
fear = 384
angry = 1416
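As a quick sanity check on the distribution above, the following minimal Python sketch sums the label counts and derives each class's share of the annotated corpus. The variable names (`counts`, `shares`, `imbalance_ratio`) are illustrative and not part of the dataset release; only the six counts come from the listing.

```python
# Label counts copied from the distribution listed above.
counts = {
    "sad": 1341,
    "happy": 1908,
    "disgust": 703,
    "surprise": 562,
    "fear": 384,
    "angry": 1416,
}

# Total number of annotated comments across the six categories.
total = sum(counts.values())

# Fraction of the annotated corpus that falls in each category.
shares = {label: n / total for label, n in counts.items()}

# Ratio between the largest and the smallest class, a rough imbalance measure.
imbalance_ratio = max(counts.values()) / min(counts.values())

# Print the classes from most to least frequent.
for label, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:<9}{n:>5}  {shares[label]:6.1%}")
print(f"{'total':<9}{total:>5}")
```

The counts add up to the 6314 annotated comments reported in the description, and the largest class (happy) is roughly five times the size of the smallest (fear).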
We also provide a balanced subset of the above data, split into training and test sets at a fixed ratio of 5:1 (training to evaluation). More information on the dataset and the experiments on it can be found in our paper (related links below).
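The description does not say how the balanced subset was built, so the following is only a sketch of one plausible procedure: downsample every class to the size of the smallest one, then split each class 5:1 into training and test portions. The function `balanced_split`, its parameters, and the `(text, label)` record layout are hypothetical and are not taken from the dataset files or the paper.

```python
import random
from collections import defaultdict


def balanced_split(records, train_parts=5, test_parts=1, seed=0):
    """Downsample each class to the smallest class size, then split per class.

    records: list of (text, label) tuples (assumed layout, not the file format).
    Returns (train, test) lists of (text, label) tuples.
    """
    rng = random.Random(seed)

    # Group the records by their emotion label.
    by_label = defaultdict(list)
    for text, label in records:
        by_label[label].append((text, label))

    # Every class is reduced to the size of the smallest class (assumption).
    per_class = min(len(items) for items in by_label.values())

    # Number of test items per class under a train_parts:test_parts ratio.
    n_test = per_class * test_parts // (train_parts + test_parts)

    train, test = [], []
    for label in sorted(by_label):
        # Random sample without replacement, so train and test never overlap.
        sample = rng.sample(by_label[label], per_class)
        test.extend(sample[:n_test])
        train.extend(sample[n_test:])

    rng.shuffle(train)
    rng.shuffle(test)
    return train, test
```

Splitting within each class, rather than over the pooled data, keeps both portions balanced; the fixed `seed` makes the split reproducible.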
Created:
2020-11-20



