five

Coronavirus (COVID-19) Geo-tagged Tweets Dataset

收藏
Mendeley Data2024-01-31 更新2024-06-29 收录
下载链接:
https://ieee-dataport.org/open-access/coronavirus-covid-19-geo-tagged-tweets-dataset
下载链接
链接失效反馈
官方服务:
资源简介:
This dataset contains the IDs of geo-tagged tweets. The tweets were captured by an on-going project deployed at https://live.rlamsal.com.np. The geolocation data was extracted from the tweets which mentioned anything about “corona”, "coronavirus", "covid" and possible variants of "sarscov2", "nCov", "covid-19", "ncov2019". Complying with Twitter's content redistribution policy, only the tweet IDs are shared. You can re-construct the dataset by hydrating these IDs. The tweet IDs in this dataset belong to the tweets tweeted providing an exact location. Please note that this dataset should be solely used for non-commercial research purposes (ignore every other LICENSE category given on this page).-------------------------------------------------------------------------Also: Coronavirus (COVID-19) Tweets Dataset (180+ million English Language Tweets; ongoing collection)-------------------------------------------------------------------------Note: I started sharing the IDs of the tweets that had exact 'point' location information, only since April 28, 2020, with some genuine requests coming in from academic researchers who did not want to hydrate the whole lists of IDs (above 180+ million tweets) shared in the Coronavirus (COVID-19) Tweets Dataset.If you need the geolocation-based data starting March 20, 2020, then use the Coronavirus (COVID-19) Tweets Dataset and hydrate the IDs while adding the following condition:data = json.loads(data)if data["coordinates"]: longitude, latitude = data["coordinates"]["coordinates"]The data is available in two formats: CSV and JSON. I'll be sharing new files every day, and the files will be named period-wise. For example, april28-june5.zip will contain tweet ID and sentiment score of the tweets (in CSV and JSON formats) that were created between April 28, 2020, and June 05, 2020.Why are only tweet IDs being shared? Twitter's content redistribution policy restricts the sharing of tweet information other than tweet IDs and/or user IDs. Twitter wants researchers always to pull fresh data. It is because a user might delete a tweet or make their profile protected. If the same tweet has already been pulled and shared on a public domain, it might make the user/community vulnerable to many inferences coming out of the shared data which currently does not exist or is private.
创建时间:
2024-01-31
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作