Comprehensive Collection of English, German, Russian and Ukrainian Tweets Containing the Word or Hashtag Ukraine During the Russian Invasion, February 2022 until May 2023
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/10930821
Description:
Comprehensive dataset of Tweets containing the keyword 'ukraine' (in German, Russian and Ukrainian) or the hashtag '#ukraine' (in English), posted from the start of the Russian invasion of Ukraine in February 2022 until May 2023. The user handle column has been excluded to protect deleted accounts that have not been retweeted or replied to. Tweets were collected via the Academic API using the Search endpoint in four languages:
Language | Query | Name in Dataset (column: event) | Number of Tweets
English | #ukraine AND lang='en' | ukraine-en-hashtag | 45.8 million
German | ukraine AND lang='de' | ukraine | 19.6 million
Russian | Украина AND lang:ru | ukraine-ru | 5.02 million
Ukrainian | Україна AND lang:uk | ukraine-uk | 4.1 million
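As a quick sanity check on the table, the per-language counts can be summed. The sketch below only restates the rounded figures given above (in millions), so the total is approximate; the name `TWEETS_MILLIONS` is illustrative and not part of the dataset.

```python
# Rounded tweet counts (in millions) per event tag, copied from the table above.
TWEETS_MILLIONS = {
    "ukraine-en-hashtag": 45.8,  # English, #ukraine
    "ukraine": 19.6,             # German
    "ukraine-ru": 5.02,          # Russian
    "ukraine-uk": 4.1,           # Ukrainian
}

# Approximate overall size and each event tag's share of it.
total = sum(TWEETS_MILLIONS.values())
shares = {event: count / total for event, count in TWEETS_MILLIONS.items()}

print(f"total: about {total:.2f} million tweets")
for event, share in shares.items():
    print(f"{event}: {share:.1%}")
```

The English hashtag query accounts for the majority of the collection, which is worth keeping in mind when comparing languages.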
Collection dates
Details on collection dates per Tweet (e.g. to compare with creation dates) as well as the IDs of Tweets for consistency checks can be found here: https://github.com/Leibniz-HBI/ukraine_twitter_data (https://doi.org/10.17605/OSF.IO/RTQXN)
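One way such a consistency check could look is a set comparison between the Tweet IDs in a downloaded file and a reference list of IDs. This is a minimal sketch under stated assumptions: the helper name `compare_ids` is hypothetical, the IDs are made up, and nothing here reflects the actual file layout of the linked repository.

```python
def compare_ids(downloaded_ids, reference_ids):
    """Return (IDs missing from the download, IDs not in the reference)."""
    downloaded = set(downloaded_ids)
    reference = set(reference_ids)
    missing = reference - downloaded      # listed in the reference, absent locally
    unexpected = downloaded - reference   # present locally, not in the reference
    return missing, unexpected


# Made-up IDs for illustration only. They are kept as strings because some
# tools read very large integers as floats and lose precision.
reference = ["1001", "1002", "1003"]
downloaded = ["1001", "1003", "1004"]

missing, unexpected = compare_ids(downloaded, reference)
print("missing:", sorted(missing))
print("unexpected:", sorted(unexpected))
```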
File Naming Scheme
To enable downloads of selected timeframes and languages, the files are named by language, start and end date of the tweet creation timestamp.
Columns
The following columns are available:
event: tag for query and language used for the query
id: Tweet ID
inserted_at: collection date
last_updated_at: last update date (relevant for metrics such as follower count)
text: Tweet text
lang: language as determined by Twitter
created_at: creation date of the Tweet
conversation_id: Tweets sharing the same conversation_id belong to the same reply tree of a Tweet (provided by Twitter)
author_follower_count: follower count of the Tweet's author account at the creation or last update time of the tweet
replied_to: account the Tweet replies to
replied_to_follower_count: follower count of the Tweet's replied to account at the creation or last update time of the tweet
quoted: if quote tweet, ID of quoted tweet
quoted_follower_count: analogous to replied_to_follower_count
retweeted: analogous to quoted
retweeted_follower_count: analogous to replied_to_follower_count
hashtags: hashtags of the Tweet
urls: URLs in the tweet, shortened/unshortened, including links to media
place_id: alphanumeric place ID provided by the Twitter API, mostly empty
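The columns above can be combined to classify Tweets and to compare collection and creation dates. The following is a minimal sketch, not the authors' processing code. It assumes the files are CSV with a header row using the column names listed above, that absent values are empty strings, and that timestamps are ISO 8601; none of this is confirmed by the description. The sample rows and the helper names `tweet_kind` and `collection_lag_hours` are made up for illustration.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Made-up sample rows; only a subset of the documented columns is shown.
SAMPLE = """event,id,inserted_at,created_at,replied_to,quoted,retweeted
ukraine-uk,1,2022-03-01T12:00:00+00:00,2022-03-01T10:00:00+00:00,,,
ukraine-uk,2,2022-03-01T12:00:00+00:00,2022-03-01T11:00:00+00:00,,,1
ukraine-ru,3,2022-03-02T09:30:00+00:00,2022-03-02T09:00:00+00:00,someaccount,,
ukraine,4,2022-03-02T09:30:00+00:00,2022-03-02T08:30:00+00:00,,1,
"""


def tweet_kind(row):
    """Classify a row by which reference column is filled (assumed empty otherwise)."""
    if row["retweeted"]:
        return "retweet"
    if row["quoted"]:
        return "quote"
    if row["replied_to"]:
        return "reply"
    return "original"


def collection_lag_hours(row):
    """Hours between Tweet creation (created_at) and collection (inserted_at)."""
    created = datetime.fromisoformat(row["created_at"])
    inserted = datetime.fromisoformat(row["inserted_at"])
    return (inserted - created).total_seconds() / 3600


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kinds = Counter(tweet_kind(r) for r in rows)
lags = [collection_lag_hours(r) for r in rows]

print(dict(kinds))
print(lags)
```

For the real files, `io.StringIO(SAMPLE)` would be replaced by an opened file handle, and the checks for empty values may need adjusting to however missing entries are actually encoded.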
Created:
2024-04-11



