BiLabel-cyberbullying dataset for the Kurdish Language

Source: NIAID Data Ecosystem (indexed 2026-05-01)

Download link:

https://data.mendeley.com/datasets/kj4c7k6xwv

Description:

This dataset contains approximately 5,000 tweets manually annotated as either cyberbullying or non-cyberbullying content. The tweets were collected from the Twitter platform and annotated by trained human labelers to determine if each tweet contained language or behavior characteristic of cyberbullying based on a standardized cyberbullying definition and guidelines. The dataset is balanced, containing roughly 2,500 cyberbullying tweets and 2,500 non-cyberbullying tweets. Each data point contains the full text of the original tweet along with its binary label indicating cyberbullying (1) or non-cyberbullying (0) status. This dataset can be used for supervised machine learning tasks such as text classification to automatically identify cyberbullying content on social media. It provides a valuable resource for researchers and organizations working to detect and mitigate online harassment, abuse, and bullying. Potential applications include developing monitoring tools, improving content moderation, and supporting data-driven policy decisions around cyberbullying.
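The record layout described above (tweet text plus a binary label) can be sketched as a small loader that also checks class balance. This is only an illustration under assumptions: the listing does not state the file format or column names, so the CSV layout and the `text` and `label` column names are hypothetical, and the sample rows are placeholders rather than real data.

```python
import csv
import io
from collections import Counter


def load_labeled_tweets(handle, text_col="text", label_col="label"):
    """Read (text, label) pairs from an open CSV handle.

    The column names are assumptions; adjust them to the actual file.
    """
    rows = []
    for record in csv.DictReader(handle):
        label = int(record[label_col])
        # The description defines only two classes:
        # 1 = cyberbullying, 0 = non-cyberbullying.
        if label not in (0, 1):
            raise ValueError(f"unexpected label: {label}")
        rows.append((record[text_col], label))
    return rows


def class_balance(rows):
    """Count how many examples fall into each class."""
    return Counter(label for _, label in rows)


# In-memory stand-in for the real file (placeholder text, not real data).
sample = io.StringIO(
    "text,label\n"
    "example tweet a,1\n"
    "example tweet b,0\n"
    "example tweet c,0\n"
)
rows = load_labeled_tweets(sample)
print(class_balance(rows))
```

With the real file, the handle would come from something like `open(path, newline="", encoding="utf-8")`, and the counts for labels 0 and 1 should each be close to 2,500 if the dataset is balanced as described.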

Created:

2024-02-12
