Reddit blackout announcements: 2023 API protest
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.qfttdz0qd
Description:
Starting June 12, 2023, many Reddit communities (subreddits) "went dark" by switching to private mode, in protest of Reddit's plans to change its API access policies and fee structure. Supporters of the protest criticize the planned changes as prohibitively expensive for third-party apps. Beyond third-party apps, there is significant concern that the API changes are a move by the platform to increase monetization, degrade the user experience, and eventually kill off other custom features such as the old.reddit.com interface, the Reddit Enhancement Suite browser extension, and more. There are also concerns that the API changes will make it harder for subreddit moderators (all of whom are unpaid users) to access the tools they use to keep their communities on-topic and free of spam.
This dataset includes the "stickied" posts that appeared on 5,351 subreddits on June 11, 2023 and June 12, 2023 - including many subreddits announcing their plans to participate in the protest. These posts were scraped using a custom Python script written for this purpose. Ironically, the script uses the PRAW (Python Reddit API Wrapper) library, which requires a valid Reddit API key. Now that the platform's new API pricing policy is in effect, it is no longer feasible for researchers to perform this type of web scraping without external funding support.
Methods
The list of subreddits was created from the list of participating subreddits that had been collated in the /r/ModCoord subreddit. An initial Python script looks at three Reddit posts and grabs the list of participating subreddits:
https://www.reddit.com/r/ModCoord/comments/1401qw5/incomplete_and_growing_list_of_participating/
https://www.reddit.com/r/ModCoord/comments/143fzf6/incomplete_and_growing_list_of_participating/
https://www.reddit.com/r/ModCoord/comments/146ffpb/incomplete_and_growing_list_of_participating/
It uses the requests library to get the HTTP response body. Then it uses re to search for links that look like <a href="/r/iphone/">r/iphone</a>, i.e. the form the list entries take in the post. After that, it does a bit of string cleanup and writes the result to an output file.
This script does not use the Reddit API at all. It's just basic HTTP requests.
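The first step can be illustrated with a minimal sketch. This is not the original script: the function names, the exact regular expression, the User-Agent string, and the output filename are all assumptions made for illustration. Only the use of requests, re, and the link shape quoted above come from the description.

```python
import re

# Assumed pattern, modeled on the link shape quoted above:
#   <a href="/r/iphone/">r/iphone</a>
# The backreference \1 requires the link text to match the href.
SUBREDDIT_LINK = re.compile(r'<a href="/r/([A-Za-z0-9_]+)/?">r/\1</a>')


def extract_subreddits(html):
    """Return subreddit names found in html, de-duplicated, in first-seen order."""
    seen = set()
    names = []
    for name in SUBREDDIT_LINK.findall(html):
        key = name.strip().lower()  # compare case-insensitively
        if key and key not in seen:
            seen.add(key)
            names.append(name.strip())
    return names


def fetch_subreddit_list(urls, out_path="subreddits.txt"):
    """Download each post page, collect subreddit names, write one per line.

    The filename and User-Agent are hypothetical placeholders.
    """
    import requests  # imported here so extract_subreddits works offline

    html_pages = []
    for url in urls:
        response = requests.get(
            url, headers={"User-Agent": "blackout-list-sketch"}, timeout=30
        )
        response.raise_for_status()
        html_pages.append(response.text)

    names = extract_subreddits("\n".join(html_pages))
    with open(out_path, "w", encoding="utf-8") as f:
        f.write("\n".join(names) + "\n")
    return names
```

Calling fetch_subreddit_list with the three /r/ModCoord URLs listed above would produce the subreddit list, provided the pages still serve markup in that form.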
A second Python script then reads that list and uses the Reddit API to request information about current posts in each subreddit. The script creates a CSV file for each of the listed subreddits and creates a new row for each "stickied" post. There currently isn't any logic to try and detect which post is the one announcing the blackout; I simply saved all of them. Many subreddits did not have any stickied posts at all, and many stickied posts were not related to the blackout.
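A rough sketch of the second step follows. It is an illustration under stated assumptions, not the original script: the column set, the function names, the choice of the "hot" listing, and the limit of 25 posts are all hypothetical. The `reddit` argument is assumed to be an authenticated `praw.Reddit` instance (created with `praw.Reddit(client_id=..., client_secret=..., user_agent=...)`), and error handling for private or banned subreddits is omitted.

```python
import csv
import os
from datetime import datetime, timezone

# Assumed column set; the dataset's actual columns may differ.
FIELDS = ["id", "title", "author", "created_utc", "permalink", "selftext"]


def stickied_rows(submissions):
    """Yield one dict per stickied submission and skip all others."""
    for post in submissions:
        if not getattr(post, "stickied", False):
            continue
        created = datetime.fromtimestamp(post.created_utc, tz=timezone.utc)
        yield {
            "id": post.id,
            "title": post.title,
            # author is None when the account has been deleted
            "author": str(post.author) if post.author else "",
            "created_utc": created.isoformat(),
            "permalink": post.permalink,
            "selftext": post.selftext,
        }


def write_subreddit_csv(reddit, name, out_dir=".", limit=25):
    """Write <name>.csv with one row per stickied post; return the row count."""
    rows = list(stickied_rows(reddit.subreddit(name).hot(limit=limit)))
    path = os.path.join(out_dir, f"{name}.csv")
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

As in the description above, no attempt is made to pick out the blackout announcement: every stickied post becomes a row, and a subreddit with no stickied posts yields a CSV containing only the header.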
Created:
2024-02-06



