HumVI
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/dataminr-ai/humvi-dataset
Description:
HumVI is a dataset of news articles in three languages (English, French, and Arabic). The articles contain instances of different types of violent incidents, categorized by the humanitarian sector they affect, such as aid security, education, food security, health, and protection. Labels were obtained by two methods: on-the-job (OTJ) labeling for the training data and offline (OFL) labeling for the test data. Inter-annotator agreement is high, which indicates that the labels are reliable. The dataset contains 17,497 annotated articles and is designed for the task of detecting violent incidents that affect humanitarian aid.
Provider:
Insecurity Insight



