COVID19_Tweets_Dataset
Source: arXiv | Updated: 2020-03-24 | Indexed: 2024-06-21
Download link:
https://github.com/lopezbec/COVID19_Tweets_Dataset
Resource overview:
COVID19_Tweets_Dataset is a multilingual Twitter dataset created by Lafayette College. It contains 6,468,526 tweets covering January 22 to March 13, 2020, collected continuously through the Twitter API using keywords such as "virus" and "coronavirus". The tweets span 66 languages; English accounts for the largest share at 63.4%. Collection initially covered English tweets only and was later extended to other languages. The dataset is intended mainly for analyzing public opinion and information dissemination during the COVID-19 pandemic, helping policymakers and researchers understand the pandemic's societal impact and the public's response.
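As a rough check on the figures above, 63.4% of 6,468,526 tweets is about 4.1 million English tweets. The sketch below shows one way such per-language shares could be computed from a list of language tags. The function name `language_shares` and the toy sample are hypothetical and for illustration only; they do not reflect the repository's actual file layout or the authors' own tooling.

```python
from collections import Counter


def language_shares(lang_codes):
    """Return {language_code: share in percent}, largest share first."""
    counts = Counter(lang_codes)
    total = sum(counts.values())
    if total == 0:
        # No tweets: nothing to report.
        return {}
    # most_common() orders languages by descending count.
    return {lang: 100.0 * n / total for lang, n in counts.most_common()}


# Hypothetical toy sample of language tags, NOT real data from the dataset.
sample = ["en", "en", "es", "en", "fr", "es", "en", "en"]
shares = language_shares(sample)

for lang, pct in shares.items():
    print(f"{lang}: {pct:.1f}%")
```

On the real data, the same computation would be applied to whatever language field accompanies each tweet record, assuming one is available.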
Provided by:
Lafayette College
Created:
2020-03-24



