simbolo-ai/burmese-hatespeech

Name: simbolo-ai/burmese-hatespeech
Creator: simbolo-ai
Published: 2024-09-27 16:43:47
License: No description available

Source: Hugging Face. Updated 2024-09-27. Indexed 2025-04-19.

Download link:

https://hf-mirror.com/datasets/simbolo-ai/burmese-hatespeech

Description:

---
language:
- my
license: gpl
---

### What is Hate Speech?

Hate speech is any toxic communication used to attack individuals or groups directly, especially on the basis of characteristics including, but not limited to: physical deficiency, mental deficiency, moral deficiency, age, ethnicity, race, national origin, caste, religion, disability, serious disease, sex, gender, gender identity, gender reassignment, sexual orientation, and immigration status.

### About Dataset

The dataset comprises text collected from Facebook comments and posts, focusing primarily on instances of hate speech. The data is in Burmese and includes a wide range of offensive and hateful expressions targeting individuals and groups based on attributes such as ethnicity, politics, and personal characteristics.

The dataset contains approximately 16.8k rows. It covers a diverse collection of hate speech instances and gives an overview of the types of toxic language prevalent in the online community that was sampled.

### Disclaimer

The dataset may contain toxic content, such as rude words, that does not meet the definition of hate speech given above.

### Contributors:

- Main Contributor: Sa Phyo Thu Htet
- Other Contributors: Ei Thandar Aung, Naing Linn Phyo, Yang Ni Linn Lat, Chaw Su Thwe Thiha Nyein, Hnin Aye Thant, Ye Bhone Lin
- Data Collectors: Sa Phyo Thu Htet, Students from Simbolo, Club Members of Data Science and Machine Learning Club, University of Technology, Yatanarpon Cyber City, Myanmar

Provided by:

simbolo-ai
