COVID-19 Tweets: A dataset containing more than 600k tweets on the novel coronavirus
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://zenodo.org/record/4024176
Description:
This dataset contains 653 996 tweets related to the coronavirus, identified by hashtags such as #COVID-19, #COVID19, #COVID, #Coronavirus, #NCoV and #Corona. The tweets were crawled from the 27th of February to the 25th of March 2020, a span of four weeks.
The tweets were generated by 390 458 users from 133 different countries and were written in 61 languages. English is the most used language, with almost 400k tweets, followed by Spanish with around 80k tweets.
The data is stored as a CSV file, where each line represents a tweet. The CSV file provides the following fields:
Author: the user who posted the tweet
Recipient: the name of the user being replied to in the case of a reply; otherwise it has the same value as the Author field
Tweet: the full content of the tweet
Hashtags: the list of hashtags present in the tweet
Language: the language of the tweet
Relationship: the type of the tweet (a retweet, a reply, a tweet with a mention, etc.)
Location: the country of the author of the tweet, which is not always available
Date: the publication date of the tweet
Source: the device or platform used to send the tweet
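As a rough illustration of how the fields above might be read, here is a minimal Python sketch using only the standard library. It assumes the CSV header names match the field names listed above exactly, which the description does not confirm; the sample rows, the language codes and the "Tweet" relationship value are hypothetical and stand in for the real file.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows. The header names are ASSUMED to match the
# field list in the description; check the real file before relying on them.
SAMPLE_CSV = """Author,Recipient,Tweet,Hashtags,Language,Relationship,Location,Date,Source
alice,alice,Stay safe #COVID19,COVID19,en,Tweet,France,2020-03-01,Twitter Web App
bob,alice,@alice thanks,,en,Replies to,,2020-03-02,Twitter for Android
carla,carla,Cuidense #Coronavirus,Coronavirus,es,Tweet,Spain,2020-03-02,Twitter for iPhone
"""


def summarize(csv_file):
    """Return (tweets per language, number of rows with an empty Location)."""
    languages = Counter()
    missing_location = 0
    for row in csv.DictReader(csv_file):
        languages[row["Language"]] += 1
        # Location is not always available, so empty values are counted.
        if not row["Location"].strip():
            missing_location += 1
    return languages, missing_location


languages, missing_location = summarize(io.StringIO(SAMPLE_CSV))
print(languages.most_common())
print("rows without a location:", missing_location)
```

To run this against the downloaded file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle, for example `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved.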
The dataset can also be used to construct a social graph, since it includes the relations "Replies to", "Retweet", "MentionsInRetweet" and "Mentions".
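One possible way to derive such a graph is sketched below in Python: each row whose Relationship is one of the four listed relations becomes a directed edge from Author to Recipient. The column names, the sample rows and the "Tweet" value for plain tweets are assumptions made for illustration, not details confirmed by the dataset description.

```python
import csv
import io
from collections import Counter

# The four relation names quoted in the dataset description.
EDGE_TYPES = {"Replies to", "Retweet", "MentionsInRetweet", "Mentions"}

# Hypothetical rows; column names are ASSUMED to match the field list,
# and "Tweet" is an assumed label for a plain tweet.
SAMPLE_CSV = """Author,Recipient,Relationship
alice,alice,Tweet
bob,alice,Replies to
carla,alice,Retweet
carla,bob,Mentions
"""


def build_edges(csv_file):
    """Return a list of (author, recipient, relation) directed edges."""
    edges = []
    for row in csv.DictReader(csv_file):
        author, recipient = row["Author"], row["Recipient"]
        # Skip plain tweets, where Recipient repeats the Author.
        if row["Relationship"] in EDGE_TYPES and author != recipient:
            edges.append((author, recipient, row["Relationship"]))
    return edges


edges = build_edges(io.StringIO(SAMPLE_CSV))
# In-degree: how often each user is replied to, retweeted or mentioned.
in_degree = Counter(target for _, target, _ in edges)
print(edges)
print(in_degree.most_common())
```

The edge list is kept as plain tuples so it can be passed to whichever graph library is preferred; keeping the relation as a third element preserves the edge type for later filtering.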
Created:
2021-01-23



