Advancing Bengali NLP for Sentiment and Emotion Dataset

Source: Mendeley Data (indexed 2026-04-09)
Download link:
https://data.mendeley.com/datasets/kztpv8g89p/1
Description:
The dataset consists of 34,812 Bengali text entries sourced from Facebook, Twitter, and Instagram posts and comments, Bengali news portals, and literature. Each source type contributes a different register. Microblog posts and comments from Facebook, Twitter, and Instagram capture informal, emotionally rich text. Newspaper and magazine articles provide formal, opinion-based sentiment. Online literature, including Bengali novels, poems, and blogs, contributes semantic relationships and linguistic nuance.

All text was collected from public sources with automated scripts. Structured social media data was obtained through platform APIs, and websites were scraped for text only, using Selenium scripts written in Python. Collection complied with privacy, data collection, and ethics requirements.

The dataset is labeled with five emotion classes and five sentiment classes. Among emotions, "Creepy" is the most frequent with 12,000 entries, followed by "Unbiased" (8,500), "Joyful" (7,500), "Bullying" (4,000), and "Surprise" (2,500). Among sentiments, "Negative" is the most frequent with 8,000 entries, followed by "Neutral" (7,000), "Strongly Negative" (6,800), "Positive" (5,500), and "Strongly Positive" (4,500).
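The class counts in the description are imbalanced, which matters when training a classifier on this data. The sketch below is illustrative only: it hard-codes the counts exactly as listed in the description (it does not read the dataset files, whose format is not described here) and computes each class's share plus an inverse-frequency weight. The function names `class_shares` and `inverse_frequency_weights` are hypothetical helpers, and the weighting formula is a common balancing heuristic, not something the dataset authors are stated to have used. The listed counts sum to 34,500 (emotion) and 31,800 (sentiment) rather than 34,812, so they may be rounded or partial figures.

```python
# Class counts copied from the dataset description (not loaded from the data files).
EMOTION_COUNTS = {
    "Creepy": 12_000,
    "Unbiased": 8_500,
    "Joyful": 7_500,
    "Bullying": 4_000,
    "Surprise": 2_500,
}
SENTIMENT_COUNTS = {
    "Negative": 8_000,
    "Neutral": 7_000,
    "Strongly Negative": 6_800,
    "Positive": 5_500,
    "Strongly Positive": 4_500,
}


def class_shares(counts):
    """Return each label's fraction of the listed total."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def inverse_frequency_weights(counts):
    """Return total / (num_classes * count) per label.

    This is a generic balancing heuristic (an assumption of this sketch):
    rarer classes receive weights above 1, frequent classes below 1.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {label: total / (num_classes * n) for label, n in counts.items()}


if __name__ == "__main__":
    for name, counts in (("emotion", EMOTION_COUNTS), ("sentiment", SENTIMENT_COUNTS)):
        shares = class_shares(counts)
        weights = inverse_frequency_weights(counts)
        print(f"{name}: listed total = {sum(counts.values())}")
        for label in counts:
            print(f"  {label}: share = {shares[label]:.1%}, weight = {weights[label]:.2f}")
```

Weights of this kind could be passed to a loss function or sampler during training, but whether reweighting helps on this dataset would need to be checked empirically.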