Twitter data discourses about COVID vaccine
DataONE | Updated 2026-02-03 | Indexed 2026-02-14
Download link:
https://search.dataone.org/view/sha256:c96f18669bf8a806858de693e45e0767f5d95176b3fb98ad5e4f6fde3f89241b
Description:
This repository contains the data and code required to replicate the results reported in the paper: “Conspiracist attributes differentiate pro- and anti-vaccine online discourses about data.”

## Repository Structure

The analysis pipeline consists of five Python notebooks and one R notebook, which must be run sequentially.

## Data Collection

00_collect_data_from_twitter.ipynb contains the code used to collect the original Twitter data. The data collection produces two tweet datasets: anti-vaccine data discourse and pro-vaccine data discourse.

Due to Twitter/X data-sharing restrictions, we provide dehydrated datasets (tweet IDs only):

- anti_vax_ids.json
- pro_vax_ids.json

To proceed with variable construction, these tweet IDs must be rehydrated using the Twitter API. The fully hydrated tweet datasets are available upon request. The rehydrated data files must be named anti_vax.json and pro_vax.json so that the rest of the analysis pipeline can use them.

## Variable Construction

The following notebooks construct the independent and control variables used in the paper’s analyses:

- 01_construct_variable_certainty.ipynb
- 02_construct_variable_causal_claims.ipynb
- 03_construct_variable_authority_figures.ipynb
- 04_construct_control_variables.ipynb

Details are provided in the Methods section of the paper.

Important: To run 01_construct_variable_certainty.ipynb and 02_construct_variable_causal_claims.ipynb, you must obtain the LIWC2015 English dictionary, which is available from https://liwc.app.

## Analysis Preparation

05_preparing_analysis_tables.ipynb combines the variables generated in notebooks 01–04 into analysis-ready CSV files. We provide these processed datasets directly as:

- authority_analysis.csv
- causal_claims_analysis.csv
- certainty_analysis.csv

## Statistical Analysis

06_full_analysis.Rmd contains the R code that uses the three CSV files above to reproduce all tables and figures reported in the paper.
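Because the pipeline depends on specific file names, it may help to check that the expected inputs are present before running a stage. The following is a minimal sketch, not part of the repository: the helper `missing_inputs`, the stage labels, and the assumption that all files sit in a single directory are hypothetical; only the file names come from the description above.

```python
from pathlib import Path

# File names are taken from the repository description. The grouping into
# two "stages" and the single-directory layout are assumptions made for
# this sketch, not something the repository defines.
REQUIRED_INPUTS = {
    "variable_construction": ["anti_vax.json", "pro_vax.json"],
    "statistical_analysis": [
        "authority_analysis.csv",
        "causal_claims_analysis.csv",
        "certainty_analysis.csv",
    ],
}


def missing_inputs(data_dir, stage):
    """Return the file names required for `stage` that are absent from `data_dir`."""
    root = Path(data_dir)
    return [name for name in REQUIRED_INPUTS[stage] if not (root / name).is_file()]


if __name__ == "__main__":
    # Report the status of each stage for the current working directory.
    for stage in REQUIRED_INPUTS:
        absent = missing_inputs(".", stage)
        status = "ok" if not absent else "missing: " + ", ".join(absent)
        print(f"{stage}: {status}")
```

Run from the directory that holds the data files, the script prints one line per stage. An empty result for `variable_construction` suggests the rehydrated files are named as notebooks 01–04 expect; it does not verify their contents.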
Created:
2026-02-06



