
ALERT (Analysis of Linguistic Extremism in Religious Text)

Source: DataCite Commons. Updated: 2025-05-09. Indexed: 2025-05-17.
Download link:
https://data.mendeley.com/datasets/2pwtrtcc72/2
Description:
The widespread dissemination of religiously aggressive content on social media platforms poses significant threats to social cohesion and communal harmony. Social media has become a prevalent venue for discussing diverse topics, including religion, which frequently leads to debates. These debates often fuel animosity, incite violence, and spread life-threatening messages that disrupt societal peace and security. To address this challenge, we developed a novel Bengali dataset, ALERT, accompanied by English translations, to identify and classify religious aggression in texts.

The dataset was collected from several online platforms, including Facebook, YouTube, blogs, online news portals, and group discussions. Preprocessing proceeded in multiple stages, including the removal of duplicates, special characters, and emoji, to improve the coherence of the dataset. Each instance was annotated by two annotators drawn from a pool of four with diverse academic, religious, and racial backgrounds, and any discrepancies were resolved by expert review.

The ALERT dataset is a collection of 4,003 Bangla texts in four categories:
1. hate speech (1,007)
2. vandalism (998)
3. life-threatening (994)
4. no aggression (1,004)

The dataset is structured with the following fields:
• Annotator 1
• Annotator 2
• Final Annotation
• Text
• English Translation

The dataset contains a mix of formal and informal Bangla texts, reflecting how people communicate in real life. Instead of simply labeling content as aggressive or not, it offers finer-grained categories that support more precise content moderation. The dataset is publicly accessible for research purposes to promote innovation and collaboration within the Bengali NLP community.
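The preprocessing and annotation steps described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the cleaning rule (keep letters, combining marks, digits, whitespace, and a little punctuation), the helper names `clean_text`, `deduplicate`, and `needs_review`, and the label strings in the sample rows are all assumptions made for illustration. The dictionary keys follow the field names listed in the description, but the actual column headers and label encoding in the published file may differ.

```python
import re
import unicodedata
from collections import Counter

KEPT_PUNCTUATION = ".,?!\u0964"  # assumption: basic punctuation plus the danda

def clean_text(text: str) -> str:
    """Drop emoji and special characters, then collapse whitespace (illustrative rule)."""
    kept = []
    for ch in text:
        if 0xFE00 <= ord(ch) <= 0xFE0F:
            continue  # variation selectors that trail some emoji
        major = unicodedata.category(ch)[0]
        # L = letters, M = combining marks (Bangla vowel signs), N = digits
        if major in ("L", "M", "N") or ch.isspace() or ch in KEPT_PUNCTUATION:
            kept.append(ch)
    return re.sub(r"\s+", " ", "".join(kept)).strip()

def deduplicate(rows):
    """Keep the first row for each distinct cleaned text."""
    seen = set()
    unique = []
    for row in rows:
        cleaned = clean_text(row["Text"])
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            unique.append({**row, "Text": cleaned})
    return unique

def needs_review(rows):
    """Rows where the two annotators disagree (sent to expert review in the description)."""
    return [r for r in rows if r["Annotator 1"] != r["Annotator 2"]]

# Hypothetical sample rows; the real dataset has 4,003 instances.
sample = [
    {"Annotator 1": "hate speech", "Annotator 2": "hate speech",
     "Final Annotation": "hate speech", "Text": "example \U0001F600 text"},
    {"Annotator 1": "hate speech", "Annotator 2": "hate speech",
     "Final Annotation": "hate speech", "Text": "example   text ###"},
    {"Annotator 1": "vandalism", "Annotator 2": "no aggression",
     "Final Annotation": "vandalism", "Text": "another one"},
]

unique_rows = deduplicate(sample)
label_counts = Counter(r["Final Annotation"] for r in unique_rows)
print(len(unique_rows), dict(label_counts), len(needs_review(unique_rows)))
```

Run on the full dataset, a `Counter` over the Final Annotation field would be one quick way to check the per-category counts quoted in the description.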
Provider:
Mendeley Data
Created:
2025-05-09