Code To Reproduce the Analysis of "Bluesky Network Topology, Polarization, and Algorithmic Curation"
Source: DataCite Commons. Updated: 2024-11-19. Indexed: 2025-04-15.
Download link:
https://dataverse.harvard.edu/citation?persistentId=doi:10.7910/DVN/NGQKDS
Description:
Code To Reproduce the Analysis of "Bluesky Network Topology, Polarization, and Algorithmic Curation"<br><br>
Data Preparation<br><br>
1. Unzip DIDs.txt.gz, which contains all 4,754,059 valid user DIDs used in this analysis.<br>
2. Run the download script with:<br>
python download_repos_multip.py --mode all<br>
This will download all repositories for the DIDs and store them in the Data/DID_REPO/ folder.<br>
3. Run Code/PythonScripts/data_processing.py to create the SQL database.<br><br>
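Step 1 above can be done from Python as well as from the shell. The following is a minimal sketch, not part of the published code: it assumes DIDs.txt.gz sits in the working directory and holds one DID per line, and the function name `unzip_dids` and the output file name DIDs.txt are hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def unzip_dids(src: str = "DIDs.txt.gz", dst: str = "DIDs.txt") -> int:
    """Decompress src into dst and return the number of non-empty lines.

    Assumption: the archive is a gzip-compressed text file with one DID per line.
    """
    # Stream the decompressed bytes to disk instead of loading them all into memory.
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)

    # Count the non-empty lines as a sanity check against the documented total.
    with open(dst, "r", encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


if __name__ == "__main__":
    if Path("DIDs.txt.gz").exists():
        count = unzip_dids()
        print(f"Decompressed {count} DIDs")
```

If the one-DID-per-line assumption holds, the returned count should match the 4,754,059 DIDs stated in step 1.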
Main Analysis<br><br>
Data Processing Notebooks<br><br>
<b>00_CreateMBFC.ipynb</b> - Creates mapping of domains to political stances according to Media Bias Fact Check (MBFC)<br>
<b>01_CreateSQL.ipynb</b> - Creates SQL database and exports upload scripts<br><br>
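To illustrate what a domain-to-stance mapping of the kind produced by 00_CreateMBFC.ipynb could be used for, here is a minimal sketch of looking up the stance of a linked URL. It is an assumption about usage, not the notebook's actual code: the dictionary entries, the stance labels, and the helper names `normalize_domain` and `stance_for_url` are all hypothetical placeholders.

```python
from typing import Dict, Optional
from urllib.parse import urlparse

# Hypothetical placeholder entries. The real mapping is derived from
# Media Bias Fact Check (MBFC) by 00_CreateMBFC.ipynb.
DOMAIN_TO_STANCE: Dict[str, str] = {
    "left.example": "left",
    "right.example": "right",
}


def normalize_domain(url: str) -> str:
    """Return the lowercase host of a URL without a leading 'www.'."""
    # urlparse only fills in the host when a scheme or '//' prefix is present.
    parsed = urlparse(url if "//" in url else "//" + url)
    host = (parsed.hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def stance_for_url(url: str, mapping: Dict[str, str] = DOMAIN_TO_STANCE) -> Optional[str]:
    """Look up the stance for a URL's domain, or None if the domain is unmapped."""
    return mapping.get(normalize_domain(url))
```

Returning None for unmapped domains keeps links to unrated sites distinguishable from rated ones.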
Analysis Notebooks<br><br>
<b>02_ActivityOverTime.ipynb</b> - Figure 1: Activity over Time, Table 5: Top Domains<br>
<b>03_TrainTransformer.ipynb</b> - Trains model for stance detection on Israel/Palestine content<br>
<b>04_Stance.ipynb</b> - Figure 9: Proportion of Posts by Stance over Time, Table 1: User Activity Distributions, Figures 2 & 3: Distribution of Interactions, Figures 8 & 10: Heatmap of Ideologies vs Neighborhood Ideology, Figure 7: Distribution of Domains by Ideology<br>
<b>05_TopologyOverTime.ipynb</b> - Figures 4 & 5: Structural Measures over Time<br>
<b>06_Feeds.ipynb</b> - Table 2: Top Feeds on Bluesky, Table 4: Distributions of Feeds, Figure 6: Distribution of Interactions with Feeds<br>
<b>07_TopicModel.ipynb</b> - Table 3: Topic Model of Feeds
Provider:
Harvard Dataverse
Created:
2024-11-18



