BiLabel-cyberbullying dataset for the Kurdish Language
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://data.mendeley.com/datasets/kj4c7k6xwv
Description:
This dataset contains approximately 5,000 tweets manually annotated as either cyberbullying or non-cyberbullying content. The tweets were collected from the Twitter platform and annotated by trained human labelers to determine if each tweet contained language or behavior characteristic of cyberbullying based on a standardized cyberbullying definition and guidelines.
The dataset is balanced, containing roughly 2,500 cyberbullying tweets and 2,500 non-cyberbullying tweets. Each data point contains the full text of the original tweet along with its binary label indicating cyberbullying (1) or non-cyberbullying (0) status.
This dataset can be used for supervised machine learning tasks such as text classification to automatically identify cyberbullying content on social media. It provides a valuable resource for researchers and organizations working to detect and mitigate online harassment, abuse, and bullying. Potential applications include developing monitoring tools, improving content moderation, and supporting data-driven policy decisions around cyberbullying.
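As a minimal sketch of how the text-plus-binary-label records might be prepared for a supervised classifier, the Python below loads rows, checks the class balance, and makes a stratified train/test split using only the standard library. The CSV layout, the column names `text` and `label`, and the helper names are assumptions for illustration; the description does not state the file format. The inline rows are placeholders, not records from the dataset.

```python
import csv
import io
import random
from collections import Counter

# Assumed layout: a CSV with "text" and "label" columns (hypothetical names).
# These rows are placeholders standing in for real tweets.
SAMPLE_CSV = """text,label
placeholder tweet a,1
placeholder tweet b,0
placeholder tweet c,1
placeholder tweet d,0
placeholder tweet e,1
placeholder tweet f,0
placeholder tweet g,1
placeholder tweet h,0
"""


def load_rows(handle):
    """Read (text, label) pairs, converting the label to an int (0 or 1)."""
    reader = csv.DictReader(handle)
    return [(row["text"], int(row["label"])) for row in reader]


def label_counts(rows):
    """Count how many rows carry each label."""
    return Counter(label for _, label in rows)


def stratified_split(rows, test_fraction=0.25, seed=0):
    """Split rows so each label keeps the same share in train and test."""
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(row[1], []).append(row)
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        # Hold out at least one example per label.
        n_test = max(1, round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


rows = load_rows(io.StringIO(SAMPLE_CSV))
counts = label_counts(rows)
train_rows, test_rows = stratified_split(rows)
print(counts)
print(len(train_rows), len(test_rows))
```

To use it with the downloaded file, replace `io.StringIO(SAMPLE_CSV)` with an opened file handle and adjust the column names to match the actual header. Splitting per label keeps the roughly 50/50 balance described above in both the training and the test portion.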
Created:
2024-02-12



