simbolo-ai/burmese-hatespeech
Hugging Face | Updated 2024-09-27 | Indexed 2025-04-19
Download link:
https://hf-mirror.com/datasets/simbolo-ai/burmese-hatespeech
Resource description:
---
language:
- my
license: gpl
---
### What is Hate Speech?
Hate speech is any toxic communication used to attack individuals or groups directly, especially on the basis of characteristics including, but not limited to: physical deficiency, mental deficiency, moral deficiency, age, ethnicity, race, national origin, caste, religion, disability, serious disease, sex, gender, gender identity, gender reassignment, sexual orientation, and immigration status.
### About Dataset
The dataset comprises text collected from Facebook comments and posts, focusing primarily on instances of hate speech. The data is in Burmese and covers a wide range of offensive and hateful expressions targeting individuals and groups on the basis of attributes such as ethnicity, politics, and personal characteristics. The dataset contains approximately 16.8k rows, a diverse collection of hate speech instances that gives an overview of the types of toxic language prevalent in the online community sampled.
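
As a minimal sketch of how the data might be inspected, the snippet below loads the repository named on this card with the Hugging Face `datasets` library and counts the rows in each split. It makes several assumptions: that the `datasets` package is installed, that the repository can be loaded without extra configuration or access approval, and that the split names are whatever the repository defines (this card does not list splits or column names). The `loader` parameter and the `summarize` helper are hypothetical names introduced only for this illustration.

```python
from typing import Any, Callable, Dict, Optional

# Repository name as given on this card.
DATASET_ID = "simbolo-ai/burmese-hatespeech"


def load_burmese_hatespeech(loader: Optional[Callable[[str], Any]] = None) -> Any:
    """Load the dataset and return whatever the loader yields.

    `loader` is a hypothetical hook for this sketch so the function can be
    exercised offline; by default it falls back to `datasets.load_dataset`.
    """
    if loader is None:
        # Imported lazily so this module can be imported without the package.
        from datasets import load_dataset
        loader = load_dataset
    return loader(DATASET_ID)


def summarize(splits: Any) -> Dict[str, int]:
    """Map each split name to its number of rows."""
    return {name: len(split) for name, split in splits.items()}
```

Calling `summarize(load_burmese_hatespeech())` should then report the row count per split, which can be compared against the approximate 16.8k figure quoted above. Column names are not stated on this card, so inspect the loaded object's features before relying on any particular field.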
### Disclaimer
The dataset may also contain toxic content, such as rude words, that does not meet the definition of hate speech given above.
### Contributors:
Main Contributor: Sa Phyo Thu Htet
Other Contributors: Ei Thandar Aung, Naing Linn Phyo, Yang Ni Linn Lat, Chaw Su Thwe Thiha Nyein, Hnin Aye Thant, Ye Bhone Lin
Data Collectors: Sa Phyo Thu Htet, students from Simbolo, and members of the Data Science and Machine Learning Club, University of Technology (Yatanarpon Cyber City), Myanmar
Provider:
simbolo-ai



