BANGLABOOK
Source: arXiv. Updated: 2023-06-08. Indexed: 2024-06-21.
Download link:
https://github.com/mohsinulkabir14/BanglaBook
Description:
BANGLABOOK is a large-scale Bengali dataset created by the System and Software Laboratory, Department of Computer Science and Engineering, Islamic University of Technology. It contains 158,065 book reviews from online bookstores, each labeled with one of three sentiment classes: positive, negative, or neutral. The reviews were collected by web crawling from two online bookstores, Rokomari and Wafilife, and then manually annotated and translated to ensure accuracy. The dataset is intended to address the scarcity of data for Bengali sentiment analysis and to help businesses and researchers better understand and use sentiment feedback from Bengali-speaking users.
Provided by:
System and Software Laboratory, Department of Computer Science and Engineering, Islamic University of Technology
Created:
2023-05-11



