Disaster Tweet Corpus 2020

NIAID Data Ecosystem, indexed 2026-03-13

Download link:

https://zenodo.org/record/3713919


Description:

This dataset consists of tweets collected during 48 disasters over 10 disaster types, with human annotations denoting whether or not a tweet is related to the disaster. This collection is intended as a benchmarking dataset for filtering algorithms.

Dataset Specification

Tweets are separated into files based on individual disasters, where each file contains a balanced number of positive and negative examples. The naming scheme is as follows:

-[-]-.ndjson

Each line in the data files is a complete JSON object containing the tweet ID, the text, and the annotation:

{"id": "12345", "text": "let's all pray for nepal!", "relevance": 1}

References

To reference this collection as a whole, please use the following citation:

Wiegmann, M., Kersten, J., Klan, F., Potthast, M., Stein, B. (2020). Analysis of Filtering Models for Disaster-Related Tweets. Proceedings of the 17th ISCRAM.

This dataset compiles tweets collected, annotated, and published in several other works. Please consider citing those too:

1. Imran, M., Castillo, C., Lucas, J., Meier, P., and Vieweg, S. (2014). AIDR: artificial intelligence for disaster response. In: WWW (Companion Volume).
2. Olteanu, A., Castillo, C., Diaz, F., and Vieweg, S. (2014). CrisisLex: A Lexicon for Collecting and Filtering Microblogged Communications in Crises. Proceedings of the 8th ICWSM.
3. Olteanu, A., Vieweg, S., and Castillo, C. (2015). What to Expect When the Unexpected Happens: Social Media Communications Across Crises. Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing.
4. Imran, M., Mitra, P., and Srivastava, J. (2016). Enabling Rapid Classification of Social Media Communications During Crises. IJISCRAM 8.
5. Alam, F., Ofli, F., and Imran, M. (2018). CrisisMMD: Multimodal Twitter Datasets from Natural Disasters. Proceedings of the 12th ICWSM.
6. Stowe, K., Palmer, M., Anderson, J., Kogan, M., Palen, L., Anderson, K. M., Morss, R., Demuth, J., and Lazrus, H. (2018). Developing and Evaluating Annotation Procedures for Twitter Data during Hazard Events. Proceedings of the LAW-MWE-CxG-2018.
7. McMinn, A. J., Moshfeghi, Y., and Jose, J. M. (2013). Building a Large-scale Corpus for Evaluating Event Detection on Twitter. Proceedings of the 22nd ACM CIKM.
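
The line-delimited JSON format described under Dataset Specification could be read with a few lines of Python. The following is a minimal sketch, not part of the dataset's own tooling. The function names `parse_ndjson`, `read_ndjson_file`, and `relevance_counts` are hypothetical. The sketch also assumes that unrelated tweets carry "relevance": 0, which the description above does not state explicitly; the second sample record is a made-up placeholder.

```python
import json
from collections import Counter


def parse_ndjson(lines):
    """Yield one dict per non-empty line; each line is a complete JSON object."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def read_ndjson_file(path):
    """Load every record of one disaster file into a list (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return list(parse_ndjson(handle))


def relevance_counts(records):
    """Count records per value of the "relevance" field."""
    return Counter(record["relevance"] for record in records)


# First line is the example from the description; the second is a made-up
# placeholder, assuming 0 marks a tweet unrelated to the disaster.
sample_lines = [
    '{"id": "12345", "text": "let\'s all pray for nepal!", "relevance": 1}',
    '{"id": "67890", "text": "placeholder unrelated tweet", "relevance": 0}',
]
sample_records = list(parse_ndjson(sample_lines))
sample_counts = relevance_counts(sample_records)
```

Because each file is described as balanced, running `relevance_counts` on the records of a real file should give roughly equal counts for the two labels, which makes it a quick sanity check after download.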

Created:

2022-06-13
