
simbolo-ai/burmese-hatespeech

Source: Hugging Face | Updated: 2024-09-27 | Indexed: 2025-04-19
Download link:
https://hf-mirror.com/datasets/simbolo-ai/burmese-hatespeech
Description:

---
language:
- my
license: gpl
---

### What is Hate Speech?

Hate speech is any toxic communication used to attack individuals or groups directly, especially based on characteristics including, but not limited to: physical deficiency, mental deficiency, moral deficiency, age, ethnicity, race, national origin, caste, religion, disability, serious disease, sex, gender, gender identity, gender reassignment, sexual orientation, and immigration status.

### About Dataset

The dataset comprises text data collected from Facebook comments and posts, primarily focusing on instances of hate speech. The data is in Burmese and includes a wide range of offensive and hateful expressions targeting individuals and groups based on various attributes such as ethnicity, politics, and personal characteristics. The dataset contains approximately 16.8k rows. This number reflects a diverse collection of hate speech instances, offering a comprehensive overview of the types of toxic language prevalent in the specific online community sampled.

### Disclaimer

The dataset may contain toxic content, such as rude words, that does not meet the definition of hate speech given above.

### Contributors:

Main Contributor: Sa Phyo Thu Htet

Other Contributors: Ei Thandar Aung, Naing Linn Phyo, Yang Ni Linn Lat, Chaw Su Thwe Thiha Nyein, Hnin Aye Thant, Ye Bhone Lin

Data Collectors: Sa Phyo Thu Htet, Students from Simbolo, Club Members of Data Science and Machine Learning Club, University of Technology, Yatanarpon Cyber City, Myanmar
Provider:
simbolo-ai
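
For readers who want to fetch the data programmatically, here is a minimal sketch using the Hugging Face `datasets` library. It rests on assumptions not confirmed by this listing: that the repository loads with default settings, and that the mirror host from the download link can be selected through the `HF_ENDPOINT` environment variable. The helper names `dataset_page_url` and `load_burmese_hatespeech` are hypothetical and exist only for illustration. Split and column names are not stated in the listing, so the sketch does not assume any.

```python
import os

REPO_ID = "simbolo-ai/burmese-hatespeech"  # dataset id taken from this listing


def dataset_page_url(repo_id: str, endpoint: str = "https://huggingface.co") -> str:
    """Build the dataset page URL for a given Hub endpoint (illustrative helper)."""
    return f"{endpoint.rstrip('/')}/datasets/{repo_id}"


def load_burmese_hatespeech(endpoint=None):
    """Download the dataset with the `datasets` library (needs network access).

    Assumption: the repository loads with default settings. Split and column
    names are not given in the listing, so inspect the returned object.
    """
    if endpoint is not None:
        # huggingface_hub reads HF_ENDPOINT when it is first imported, so this
        # only takes effect if the Hub libraries have not been imported yet.
        os.environ["HF_ENDPOINT"] = endpoint
    from datasets import load_dataset  # imported lazily: optional dependency
    return load_dataset(REPO_ID)
```

Calling `load_burmese_hatespeech("https://hf-mirror.com")` and printing the result should list whatever splits and columns the repository actually provides. Given the disclaimer above, the rows may include rude but non-hateful text, so downstream filtering is left to the user.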