IEHate

arXiv · Updated 2023-06-28 · Indexed 2024-06-21
Download link:
https://github.com/Farhan-jafri/Indian-Election
Description:
The IEHate dataset, created at Jamia Millia Islamia, contains 11,457 manually annotated Hindi tweets posted during the 2022 Indian state legislative assembly elections. It is intended for studying hate speech in political discourse, particularly the challenges posed by a low-resource language. Tweets were first collected through the Twitter API and then labeled by four annotators according to whether they contain hate speech, using a multi-stage annotation process designed to ensure label quality. The dataset mainly supports developing and evaluating hate speech detection methods for low-resource languages, with the aim of fostering healthier, more inclusive democratic discourse.
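Because the labels come from four annotators, a common sanity check on such data is an inter-annotator agreement score. The sketch below computes Fleiss' kappa for binary labels in plain Python. It is only an illustration: the description does not say which agreement measure (if any) was reported for IEHate, and the function name `fleiss_kappa`, the 0/1 label encoding, and the toy label matrix are assumptions, not part of the dataset.

```python
import math
from collections import Counter


def fleiss_kappa(ratings, categories=(0, 1)):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings: list of rows, one row per tweet, one label per annotator.
    categories: the allowed label values (assumed here: 0 = not hate, 1 = hate).
    """
    if not ratings:
        raise ValueError("ratings must contain at least one item")
    n_raters = len(ratings[0])
    if n_raters < 2:
        raise ValueError("at least two raters are required")

    n_items = len(ratings)
    category_totals = Counter()
    agreement_sum = 0.0

    for row in ratings:
        if len(row) != n_raters:
            raise ValueError("every item needs the same number of ratings")
        if set(row) - set(categories):
            raise ValueError("unexpected label in row: %r" % (row,))
        counts = Counter(row)
        # Share of agreeing rater pairs for this item.
        sum_sq = sum(counts[c] ** 2 for c in categories)
        agreement_sum += (sum_sq - n_raters) / (n_raters * (n_raters - 1))
        for c in categories:
            category_totals[c] += counts[c]

    observed = agreement_sum / n_items
    total_ratings = n_items * n_raters
    # Agreement expected by chance, from the overall label proportions.
    expected = sum((category_totals[c] / total_ratings) ** 2 for c in categories)

    if math.isclose(expected, 1.0):
        # Every rating is the same label; kappa is undefined in that case.
        return math.nan
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy labels: 4 tweets x 4 annotators (not taken from IEHate).
toy_labels = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
print(f"{fleiss_kappa(toy_labels):.3f}")  # prints 0.407
```

Values near 1 indicate strong agreement and values near 0 indicate agreement no better than chance. With four annotators, rows such as `[1, 1, 0, 0]` have no majority, so a labeling pipeline of this kind needs some tie-resolution step.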
Provider:
Jamia Millia Islamia
Created:
2023-06-26