
BiLabel-cyberbullying dataset for the Kurdish Language

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://data.mendeley.com/datasets/kj4c7k6xwv
Description:
This dataset contains approximately 5,000 tweets manually annotated as either cyberbullying or non-cyberbullying content. The tweets were collected from the Twitter platform and annotated by trained human labelers to determine if each tweet contained language or behavior characteristic of cyberbullying based on a standardized cyberbullying definition and guidelines. The dataset is balanced, containing roughly 2,500 cyberbullying tweets and 2,500 non-cyberbullying tweets. Each data point contains the full text of the original tweet along with its binary label indicating cyberbullying (1) or non-cyberbullying (0) status. This dataset can be used for supervised machine learning tasks such as text classification to automatically identify cyberbullying content on social media. It provides a valuable resource for researchers and organizations working to detect and mitigate online harassment, abuse, and bullying. Potential applications include developing monitoring tools, improving content moderation, and supporting data-driven policy decisions around cyberbullying.
Created:
2024-02-12
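
The description above says each record pairs a tweet's text with a binary label (1 for cyberbullying, 0 for non-cyberbullying) and that the two classes are roughly balanced. Before training a classifier, a reader may want to check that balance and hold out a test set that preserves it. The following is a minimal sketch of those two steps in standard-library Python. The listing does not state the file format or column names, so the CSV layout and the names `text` and `label` are assumptions, and the sample rows are made up for illustration rather than taken from the dataset.

```python
import csv
import io
import random
from collections import Counter

# Hypothetical column names: the listing does not document the file layout.
TEXT_COL = "text"
LABEL_COL = "label"


def load_rows(handle):
    """Read (text, label) pairs from a CSV file object, keeping only labels 0 and 1."""
    rows = []
    for record in csv.DictReader(handle):
        label = record[LABEL_COL].strip()
        if label in ("0", "1"):
            rows.append((record[TEXT_COL], int(label)))
    return rows


def stratified_split(rows, test_fraction=0.2, seed=0):
    """Split rows into train/test so each label keeps the same share in both parts."""
    rng = random.Random(seed)
    train, test = [], []
    for label in (0, 1):
        group = [row for row in rows if row[1] == label]
        rng.shuffle(group)
        cut = int(len(group) * test_fraction)
        test.extend(group[:cut])
        train.extend(group[cut:])
    return train, test


# Made-up placeholder rows standing in for the real file.
SAMPLE = "text,label\n" + "".join(f"example tweet {i},{i % 2}\n" for i in range(10))

rows = load_rows(io.StringIO(SAMPLE))
counts = Counter(label for _, label in rows)
train, test = stratified_split(rows)
print("label counts:", dict(counts))
print("train size:", len(train), "test size:", len(test))
```

To run this against the downloaded file, replace `io.StringIO(SAMPLE)` with an opened file handle (for example `open(path, encoding="utf-8", newline="")`) and adjust the two column names to match the actual header.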