
Fine-Grained Balanced Cyberbullying Dataset

Source: IEEE | Updated: 2020-11-12 | Indexed: 2026-04-17
Download link:
https://ieee-dataport.org/open-access/fine-grained-balanced-cyberbullying-dataset
Description:
Amidst the COVID-19 pandemic, cyberbullying has become an even more serious threat. Our work aims to investigate the viability of an automatic multiclass cyberbullying detection model that is able to classify whether a cyberbully is targeting a victim’s age, ethnicity, gender, religion, or other quality. Previous literature has not yet explored making fine-grained cyberbullying classifications of such magnitude, and existing cyberbullying datasets suffer from quite severe class imbalances. To combat these challenges, we establish a framework for the automatic generation of balanced data by using a semi-supervised online Dynamic Query Expansion (DQE) process to extract more natural data points of a specific class from Twitter. We also propose a Graph Convolutional Network (GCN) classifier, using a graph constructed from the thresholded cosine similarities between tweet embeddings. With our DQE-augmented dataset, which we have made publicly available, we compare our GCN model using eight different tweet embedding methods and six other classification models over two sizes of datasets. Our results show that our proposed GCN model matches or exceeds the performance of the baseline models, as indicated by McNemar statistical tests.
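The graph construction mentioned in the description (linking tweets whose embeddings have a cosine similarity above a threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy embeddings, the threshold value of 0.8, and the choice to omit self-loops are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_similarity_graph(embeddings, threshold):
    """Return a symmetric 0/1 adjacency matrix as a list of lists.

    Nodes i and j are linked when their cosine similarity is >= threshold.
    Self-loops are omitted (an assumption of this sketch; many GCN
    formulations add them later, during normalization).
    """
    n = len(embeddings)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if cosine_similarity(embeddings[i], embeddings[j]) >= threshold:
                adjacency[i][j] = 1
                adjacency[j][i] = 1
    return adjacency


# Toy, made-up 3-dimensional "tweet embeddings" (hypothetical values).
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
]

# The threshold 0.8 is an arbitrary illustrative choice.
graph = build_similarity_graph(toy_embeddings, threshold=0.8)
```

In this toy case only the first two vectors point in nearly the same direction, so they are the only pair joined by an edge. A real pipeline would likely use a vectorized similarity computation, since the pairwise loop here scales quadratically with the number of tweets.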
Provided by:
Lu, Chang-Tien; Wang, Jason; Fu, Kaiqun
Created:
2020-11-12