Replication data for: Effects of algorithmic flagging on fairness: Quasi-experimental evidence from Wikipedia

DataONE | Updated 2021-04-26 | Indexed 2024-06-08

Download link:

https://search.dataone.org/view/sha256:c00f30e8642430e0223bcc87a394213c5864033afc5f368f6b50b8096bbeb146

Description:

Code overview

The code is all in code.tar.gz.

Identifying thresholds and cutoffs over time

Nearly all of this logic is in identify_cutoffs.py. The script iterates over the git repository, parses wmf-config/InitializeSettings.php, and interprets historical JSON versions of the configuration. It builds a pandas table of events with threshold configuration settings and some other configuration settings, such as when different UI elements were enabled. There are some traces of the first attempt at the project, a time series analysis that failed due to high noise.

In cases where thresholds are not configured, default thresholds are configured in the ORES service repository, in mediawiki-extensions-ORES/extension.json (a copy of the git repository is in mediawiki-extensions-ORES.tar.gz). get_default_threshold_strings.py runs against this git repository to get the history of the default thresholds. They don’t change much.

Reading server admin log

Right now, the code to get the history of deployments is a chunk in identify_cutoffs.py. I think I will refactor this into its own file. The precise timing of changes to the models comes not from the source code repository but from the live deployments. The SAL (server admin log) publishes a history of live deployments.

Converting ORES configuration strings to prediction score cutoffs

This is done by ores_archaeologist.py. It is by far the most complex script, and it wraps functionality from the revscoring package (a copy of this repository is in revscoring.tar.gz) to load different versions of models and analyze them. It checks out git commits corresponding to changes in InitializeSettings.php or the SAL and installs the correct Python dependencies in a helper repository, so that the models run in an environment as close as possible to the original one and the thresholds are correct. helper.py has functions used by ores_archaeologist.py. Sometimes there are errors; in those cases we analyze data starting after the last error to obtain a continuous period.

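The rule of starting the analysis after the last error can be illustrated with a minimal sketch. This is not the project's code: the function name `continuous_period`, the dictionary keys `timestamp` and `error`, and the example dates are all hypothetical and chosen only for illustration.

```python
from datetime import datetime


def continuous_period(events):
    """Return the events that come after the most recent error, in time order.

    `events` is a list of dicts with the hypothetical keys "timestamp"
    (a datetime) and "error" (a bool). These names are assumptions made
    for this sketch and do not come from the project code.
    """
    ordered = sorted(events, key=lambda e: e["timestamp"])
    last_error = -1  # index of the latest failed event, -1 if none failed
    for i, event in enumerate(ordered):
        if event["error"]:
            last_error = i
    # Everything after the last error forms one uninterrupted period.
    return ordered[last_error + 1:]


# Made-up example events, deliberately out of order.
events = [
    {"timestamp": datetime(2018, 1, 1), "error": False},
    {"timestamp": datetime(2018, 3, 1), "error": True},
    {"timestamp": datetime(2018, 2, 1), "error": False},
    {"timestamp": datetime(2018, 4, 1), "error": False},
    {"timestamp": datetime(2018, 5, 1), "error": False},
]
usable = continuous_period(events)
```

Here `usable` holds only the two events dated after the failed one, so the retained period has no gaps caused by errors.
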
get_model_threshold.py is a simple script that is run by ores_archeologist.py and actually loads the revscoring code. The ores_archeologist.py script can also attempt to find historical revision scores. This was not used in the paper because these historical scores may not be reliable. revscoring_score_shim.py is analogous to get_model_threshold.py, but for scoring edits.

Sampling from Wikimedia history and event table

sample_edits_near_thresholds.py is a Spark script that runs on the Wikimedia Foundation data lake and builds the revision dataset. Much of the logic is in spark_functions.py.

Fitting models

The master files are fit_10_rdds.R and fit_vlb_rdds.R. During the review cycle we found a bug in the ‘very likely bad’ data and I refit only those models to save time. fit_10_rdds.R just fits the models asynchronously. The main logic is in fit_base_rdds.R and modeling_init.R. The dataset is put together in modeling_init.R. ob_util.R and helper.R have a few miscellaneous functions. rdd_defaults.R has the formulas and sets the Stan modeling parameters. Fitted models are available in models.tar.gz.

Interpreting models

analyze_threshold_models.R builds smaller dataframes and variables that are used by the knitr LaTeX system to build the paper. analyze_vlb_models.R does the same, but just for the ‘very likely bad’ data. Code shared by both scripts is in analyze_main_models.R.

Dataset summary statistics

Some additional statistics reported in the paper are calculated in summary_stats.R.

Evaluating encoded bias

The bias_analysis.tar.gz archive has code and data used for evaluating the bias of the ORES models, including a copy of the editquality git repository. A copy of that repository is in editquality.tar.gz.

Building the paper and appendix

This is in the paper.tar.gz and appendix.tar.gz archives.

Data files overview

The following data files are published at the top level of the dataverse. Copy them into a data subdirectory to use them with the code.

cutoff_revisions_2periods.csv.gz.part1 and cutoff_revisions_2periods.csv.gz.part2 have the full dataset of edits within the neighborhood of the thresholds. To get the full CSV, concatenate the two parts and then decompress the output:

    cat cutoff_revisions_2periods.csv.gz.part1 cutoff_revisions_2periods.csv.gz.part2 > cutoff_revisions_2periods.csv.gz

cutoff_revisions_sample.csv and cutoff_revisions_sample_vlbfix.csv have the sampled datasets on which the models are fit.

threshold_strata_counts.csv and threshold_strata_counts_vlbfix.csv have the counts from stratified sampling, which are used to calculate modeling weights.

What does vlbfix mean?

The original submission of the paper contained a bug that affected the sample at the verylikelybad RCFilters threshold. The bug was on line 241 of sample_edits_near_threshold.py and led to NA values in the sample, which would have affected the sample weights. During the revise-and-resubmit process we found and fixed the bug and fit new models at the verylikelybad threshold.

LICENSE

The data in this repository is released under a CC0 license. The original code is released under an MIT permissive license. Code included through external git repositories is repackaged here and released under the applicable open source licenses. See the packages for details.
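
For readers without a Unix shell, the concatenate-then-decompress step described under "Data files overview" can be sketched in Python using only the standard library. The function name `rebuild_full_csv` and its parameters are hypothetical; the file names are the ones listed above, and the sketch assumes both parts have already been copied into the data subdirectory.

```python
import gzip
import shutil
from pathlib import Path


def rebuild_full_csv(data_dir="data", stem="cutoff_revisions_2periods.csv"):
    """Join the two archive parts and decompress the result.

    `data_dir` and `stem` are illustrative parameters for this sketch.
    Returns the path of the decompressed CSV.
    """
    data_dir = Path(data_dir)
    parts = [data_dir / f"{stem}.gz.part1", data_dir / f"{stem}.gz.part2"]
    combined = data_dir / f"{stem}.gz"

    # Equivalent of `cat part1 part2 > combined`: append the raw bytes in order.
    with open(combined, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)

    # Stream the decompression so the full dataset never has to fit in memory.
    csv_path = data_dir / stem
    with gzip.open(combined, "rb") as src, open(csv_path, "wb") as out:
        shutil.copyfileobj(src, out)
    return csv_path
```

Calling `rebuild_full_csv("data")` would write cutoff_revisions_2periods.csv.gz and cutoff_revisions_2periods.csv next to the two parts. The order of the parts matters, since the second part is only the tail of a single gzip stream.
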

Created:

2023-11-19
