Coronavirus (COVID-19) Geo-tagged Tweets Dataset

Mendeley Data | Updated 2024-01-31 | Indexed 2024-06-29
Download link:
https://ieee-dataport.org/open-access/coronavirus-covid-19-geo-tagged-tweets-dataset
Description:
This dataset contains the IDs of geo-tagged tweets. The tweets are captured by an ongoing project deployed at https://live.rlamsal.com.np. The model monitors the real-time Twitter feed for the following keywords: "corona", "coronavirus", "covid", "pandemic", "lockdown", "quarantine", "hand sanitizer", "ppe", "n95", different possible variants of "sarscov2", "nCov", "covid-19", "ncov2019", "2019ncov", "flatten(ing) the curve", "social distancing", "work(ing) from home", and the respective hashtags of all these keywords. In compliance with Twitter's content redistribution policy, only the tweet IDs are shared. You can reconstruct the dataset by hydrating these IDs. The tweet IDs in this dataset belong to tweets that were posted with an exact location. Please note that this dataset should be used solely for non-commercial research purposes (ignore every other LICENSE category given on this page).

Coronavirus (COVID-19) Tweets Dataset (190+ million English-language tweets; ongoing collection)

Note: I started sharing the IDs of tweets that had exact "point" location information only on April 28, 2020, after receiving genuine requests from academic researchers who did not want to hydrate the complete lists of IDs shared in the Coronavirus (COVID-19) Tweets Dataset.

Update: I have received many requests, especially from social science researchers, to also make the geo-tagged tweets created between March 20, 2020, and April 28, 2020, available in this dataset. Hydrating millions of tweet IDs can be a tedious task for people with less technical expertise. Therefore, I have started hydrating the IDs provided in the Coronavirus (COVID-19) Tweets Dataset, and I will share the geo-tagged tweets posted between these dates as the hydration task progresses. I'll be adding new CSV files, and the naming convention for these newly added files will be day-wise (instead of period-wise). Bookmark this page for further updates.

The data is available in two formats: CSV and JSON. I'll be sharing new files every day, and the files will be named period-wise. For example, april28-june5.zip will contain the tweet ID and sentiment score (in CSV and JSON formats) of the tweets that were created between April 28, 2020, and June 05, 2020.

Why are only tweet IDs being shared?
Twitter's content redistribution policy restricts the sharing of tweet information other than tweet IDs and/or user IDs. Twitter wants researchers always to pull fresh data, because a user might delete a tweet or make their profile protected. If the same tweet has already been pulled and shared publicly, the user or community could be exposed to inferences drawn from shared data that no longer exists or has since become private.
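Since the description asks users to reconstruct the dataset by hydrating the shared IDs, the preparation step might look roughly like the sketch below. It only reads tweet IDs from CSV text and groups them into batches; it does not contact Twitter. The column layout (tweet ID first, sentiment score second, no header) is an assumption rather than something the description confirms, the sample IDs are made-up placeholders, and the batch size of 100 reflects the per-request limit of Twitter's tweet lookup endpoints as commonly documented, which should be checked against the current API documentation. The function names are hypothetical.

```python
import csv
import io
from typing import Iterator, List


def read_tweet_ids(csv_text: str) -> List[str]:
    """Collect the first column of every row that looks like a numeric tweet ID.

    Assumption: the tweet ID is the first column. Rows whose first cell is not
    purely numeric (for example a header row, if one exists) are skipped.
    """
    ids = []
    for row in csv.reader(io.StringIO(csv_text)):
        if row and row[0].strip().isdigit():
            # Keep IDs as strings so they are never rounded or reformatted.
            ids.append(row[0].strip())
    return ids


def batched(ids: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` IDs, one chunk per lookup request."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# Placeholder rows for illustration only: these are not real tweet IDs or scores.
sample = (
    "1255000000000000001,0.25\n"
    "1255000000000000002,-0.10\n"
    "1255000000000000003,0.00\n"
)

for batch in batched(read_tweet_ids(sample), size=2):
    # Each batch would be passed to whichever hydration tool or API client is used.
    print(batch)
```

Each printed batch would then be handed to a hydration tool or an API client of the user's choice. Tweets that have been deleted or protected since collection will not be returned, which is the behaviour the redistribution policy is meant to ensure.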
Created:
2024-01-31