Supplementary material (data and code) for the paper "Using Wikipedia Pageview Data to Investigate Public Interest in Climate Change at a Global Scale"
Source: NIAID Data Ecosystem. Indexed 2026-05-01.
Download link:
https://zenodo.org/record/10623664
Description:
Code and data to reproduce the analyses and figures from the paper 'Using Wikipedia Pageview Data to Investigate Public Interest in Climate Change at a Global Scale' (https://doi.org/10.1145/3614419.3644007).
The analysis is based on the following data and code files:
(0) The database wiki_climate.duckdb contains the pageview data. It is stored in the database for convenience, but it is the same data that is described in, and can be downloaded from, a Wikimedia blog post. The main analysis is performed in wiki_climate_analysis.R.
(1) wikidata_cc_results.csv: contains the Wikidata QIDs for all concepts that are part of WikiProject Climate Change (WPCC), including whether each concept exists in the 25 most visited language editions.
(2) wiki_cc_article_size.csv: contains the article size in bytes for all WPCC articles in these language editions (used to create Figure 1). The sizes were retrieved with the function in getWikiPageSize.R.
(3) urls_wikidataid_map.csv and redirected_urls_wikidatamap.csv: contain the QIDs of all unique URLs present in the Wikipedia pageview data in the DuckDB database. With this mapping, the pageviews can be filtered down to the URLs that are part of WPCC.
(4) WP2022_Demographic_Indicators_Medium.csv: used to get the population for every year, so that per capita pageviews can be calculated.
(5) One of the main functions is getCountryCCTotalDay() from getCountryCCTotalDay.R, which needs to be run manually for each year. The yearly results are then merged and saved for convenience in daily_cc_interest_total_per_country.csv.
(6) The causal impact analysis can be done using the file wiki_cc_causal_impact.R.
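To illustrate how the files in (3) to (5) fit together, the following is a minimal Python sketch, not the authors' R code. It assumes a simplified layout: pageview rows with date, country, url and views fields, a URL-to-QID mapping, a set of WPCC QIDs, and a population table keyed by country and year. All field names, URLs, QIDs and numbers are hypothetical placeholders, and the units of the real population file should be checked before dividing.

```python
from collections import defaultdict

# Hypothetical stand-in for urls_wikidataid_map.csv (URL -> Wikidata QID).
url_to_qid = {
    "de.example/Klimawandel": "Q-EXAMPLE-1",
    "en.example/Climate_change": "Q-EXAMPLE-1",
    "en.example/Cat": "Q-EXAMPLE-2",
}
# Hypothetical stand-in for the QIDs listed in wikidata_cc_results.csv.
wpcc_qids = {"Q-EXAMPLE-1"}
# Hypothetical population table: (country, year) -> number of people.
population = {("DE", 2022): 1000, ("FR", 2022): 500}

# Hypothetical pageview rows for one year.
rows_2022 = [
    {"date": "2022-01-01", "country": "DE", "url": "de.example/Klimawandel", "views": 300},
    {"date": "2022-01-01", "country": "DE", "url": "en.example/Climate_change", "views": 200},
    {"date": "2022-01-01", "country": "DE", "url": "en.example/Cat", "views": 900},
    {"date": "2022-01-02", "country": "FR", "url": "en.example/Climate_change", "views": 100},
]


def filter_wpcc(rows, url_to_qid, wpcc_qids):
    """Keep only rows whose URL maps to a QID that is part of WPCC."""
    return [r for r in rows if url_to_qid.get(r["url"]) in wpcc_qids]


def daily_country_totals(rows):
    """Sum views per (date, country) pair."""
    totals = defaultdict(int)
    for r in rows:
        totals[(r["date"], r["country"])] += r["views"]
    return dict(totals)


def per_capita(totals, population):
    """Divide each daily total by the country's population in that year."""
    result = {}
    for (date, country), views in totals.items():
        pop = population.get((country, int(date[:4])))
        if pop:  # skip countries or years without population data
            result[(date, country)] = views / pop
    return result


def merge_years(yearly_results):
    """Combine per-year dictionaries into one (keys do not overlap across years)."""
    merged = {}
    for yearly in yearly_results:
        merged.update(yearly)
    return merged


totals_2022 = daily_country_totals(filter_wpcc(rows_2022, url_to_qid, wpcc_qids))
per_capita_2022 = per_capita(totals_2022, population)
all_years = merge_years([per_capita_2022])  # add one entry per processed year
```

Processing one year at a time and merging afterwards mirrors the manual per-year runs described in (5); the merged result plays the role of daily_cc_interest_total_per_country.csv in this sketch.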
Created:
2024-02-06



