
Bangla Online Comments Dataset

Source: arXiv. Updated: 2021-02-04. Indexed: 2024-06-21.
Download link:
https://data.mendeley.com/datasets/9xjx8twk8p
Description:
The Bangla Online Comments Dataset was created by the Department of Computer Science and Engineering at Brac University to help identify and prevent online harassment by analyzing Bengali online comments. It contains 44,001 comments collected from Facebook posts by public figures, labeled with harassment categories such as sexual harassment, threats, trolling, and religious harassment. The comments were scraped from selected posts, filtered and deduplicated, and then classified by content. The dataset is intended mainly for training machine learning models that automatically detect and classify harassment in Bengali, in support of a safer online environment.
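
The filtering and deduplication step mentioned in the description can be illustrated with a minimal Python sketch. This is not the dataset authors' pipeline: the function names, the `min_chars` threshold, and the choice of Unicode NFC normalization plus whitespace collapsing are all assumptions made for illustration.

```python
import unicodedata


def normalize(comment: str) -> str:
    """Apply NFC normalization and collapse runs of whitespace.

    Assumption: comments that differ only in Unicode composition or
    spacing should be treated as the same comment.
    """
    return " ".join(unicodedata.normalize("NFC", comment).split())


def filter_and_dedupe(comments, min_chars=1):
    """Drop too-short comments and exact duplicates, keeping first occurrences.

    `min_chars` is a hypothetical threshold, not a value from the source.
    """
    seen = set()
    kept = []
    for raw in comments:
        text = normalize(raw)
        if len(text) < min_chars:
            continue  # filtered out: empty or shorter than the threshold
        if text in seen:
            continue  # duplicate of a comment already kept
        seen.add(text)
        kept.append(text)
    return kept
```

Calling `filter_and_dedupe(raw_comments)` on a list of scraped strings returns the surviving comments in their original order, ready for annotation.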
Provided by:
Department of Computer Science and Engineering, Brac University
Created:
2021-02-04