Indonesia Instagram cyberbullying

Mendeley Data, indexed 2026-04-18

Download link:

https://data.mendeley.com/datasets/xthb26ntc5

Description:

"All" Dataset (Full Dataset) This is the primary and most comprehensive dataset containing all text samples (comments). Each text sample has a multi-label label covering all possible categories, including neutral text (not cyberbullying) and various types of cyberbullying (e.g., neutral, flaming, denigration, racism, etc.). Usage: This dataset is used in two scenarios: Scenario A (Single-Stage Multi-Label Classification): The "All" dataset is used directly to train a model to classify text into one or more categories simultaneously (e.g., a text can be labeled neutral only, or both flaming and racism simultaneously). Scenario B - Stage 1 (Binary Detection): This dataset is used to train a binary classification model. For this stage, the original multi-label labels are transformed into binary labels (Yes/No): The binary label is No (Not Cyberbullying): If the text label is neutral. Binary label value is Yes (Cyberbullying): If the text contains at least one cyberbullying type label (e.g., flaming, denigration, etc.). Dataset Cyberbullying (Derived Dataset) Definition: This is a subset of the "All Dataset." This dataset was created by filtering and sampling only text that was identified as cyberbullying (binary label value: Yes) in Scenario B - Phase 1. Characteristics: This dataset no longer contains text with a neutral label. It only contains texts guaranteed to contain cyberbullying, along with a multi-label label detailing the type of cyberbullying (e.g., flaming, denigration, etc.). Usage: Scenario B - Phase 2 (Cyberbullying Type Classification): This dataset is used exclusively to train the model in the second phase of Scenario B. The goal is to classify the type of cyberbullying from a text, once the text has been confirmed as cyberbullying by the Phase 1 model.

Created:

2025-11-11