CrisisBench: Benchmarking Crisis-related Social Media Datasets for Humanitarian Information Processing

Source: NIAID Data Ecosystem, indexed 2026-03-12
Download link:
https://data.mendeley.com/datasets/rzj4m2r78m
Resource description:
Time-critical analysis of social media streams is important for humanitarian organizations when planning rapid response during disasters. The crisis informatics research community has developed several techniques and systems for processing and classifying big crisis-related data posted on social media. However, due to the dispersed nature of the datasets used in the literature (e.g., for training models), it is not possible to compare the results and measure the progress made towards building better models for crisis informatics tasks. In this work, we attempt to bridge this gap by combining various existing crisis-related datasets. We consolidate eight human-annotated datasets and provide 166.1k and 141.5k tweets for informativeness and humanitarian classification tasks, respectively. We believe that the consolidated dataset will help train more sophisticated models. Moreover, we provide benchmarks for both binary and multiclass classification tasks using several deep learning architectures, including CNN, fastText, and transformers. We make the dataset and scripts available at: https://crisisnlp.qcri.org/crisis_datasets_benchmarks.html
Created:
2021-04-09