ValentinLAFARGUE/EIF-Manipulated-distributions

Name: ValentinLAFARGUE/EIF-Manipulated-distributions
Creator: ValentinLAFARGUE
Published: 2026-03-20 17:04:05
License: MIT

Source: Hugging Face. Updated 2026-03-20; indexed 2026-03-21.

Download link:

https://hf-mirror.com/datasets/ValentinLAFARGUE/EIF-Manipulated-distributions


Description:

---
license: mit
task_categories:
- table-question-answering
tags:
- Auditing
- Manipulation-proof
- Fairwashing
size_categories:
- 10K<n<100K
papers:
- https://arxiv.org/abs/2507.20708
---

# Exposing the Illusion of Fairness (EIF) Manipulation Results

We consider use cases where the auditee has developed a model with fairness issues. The auditee tries to hide the problem by selecting a subsample that optimizes the fairness metric the auditor will compute. From the supervisory authority's perspective, however, submitting a non-representative sample constitutes a deceptive attempt by the auditee to obstruct or distort the assessment.

We present here the original empirical distribution (through the original dataset) and the manipulated distributions. These manipulated distributions can be used to study how to detect the manipulation effectively.

## 📌 Overview

This dataset accompanies the paper:

**Exposing the Illusion of Fairness (EIF): Auditing Vulnerabilities to Distributional Manipulation Attacks**
https://arxiv.org/abs/2507.20708

HF Models repository: https://huggingface.co/ValentinLAFARGUE/EIF-biased-classifiers

Code repository: https://github.com/ValentinLafargue/Inspection

It contains preprocessed tabular datasets, model outputs, and fairness-related quantities used to analyze bias and mitigation strategies. For each dataset, the original distribution is to be compared with the manipulated ones.

## 📊 Dataset Description

The dataset is organized by **benchmark dataset**, each containing multiple NumPy arrays:

- Original data
- Model predictions and thresholds
- Fairness metrics (Disparate Impact)
- Gradients and mitigation-related quantities

## 📁 Structure

    EIF-dataset/
    ├── ADULT/
    ├── ASC_INC/
    ├── ASC_MOB/
    ├── ASC_EMP/
    ├── ASC_TRA/
    ├── ASC_PUC/
    ├── BAF/
    └── CelebA/

Each folder contains `.npy` files such as:

- `original.npy` → original test dataset features (including S and Ŷ)
- `DI.npy` → Disparate Impact values
- `acc.npy` → model accuracy
- `threshold.npy` → decision threshold applied to the logits
- `Miti_mod_SF.npy` → distribution manipulation through the Replace (S,Ŷ) method
- `Miti_sampling_X.npy` → distribution manipulation through the Matching (X,S,Ŷ) method
- `Miti_Gems_mean.npy` → distribution manipulation through Gems (entropic projection), proportional variant
- `Miti_Gems_number.npy` → distribution manipulation through Gems (entropic projection), balanced variant
- `Grad_reg_me.npy` → distribution manipulation through the Wasserstein gradient method, proportional variant
- `Grad_reg_nu.npy` → distribution manipulation through the Wasserstein gradient method, balanced variant
- `Grad_la_me.npy` → distribution manipulation through the Wasserstein gradient method, 1D-projection, proportional variant
- `Grad_la_nu.npy` → distribution manipulation through the Wasserstein gradient method, 1D-projection, balanced variant

## 📊 Datasets, Sensitive Attributes, and Disparate Impact

The Disparate Impact is the ratio of positive outcome rates between groups. The groups are defined by a so-called *sensitive attribute*, also called a *protected attribute* in legal texts.

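As a minimal sketch of the ratio described above, the snippet below computes a Disparate Impact value from a binary sensitive attribute and binary predictions. The function name `disparate_impact`, the toy arrays, and the convention that `S = 0` is the group in the numerator are assumptions made for illustration; the column layout of `original.npy` is not documented here, so the sketch does not read from it.

```python
import numpy as np


def disparate_impact(s, y_hat):
    """Ratio of positive-prediction rates: P(Y_hat=1 | S=0) / P(Y_hat=1 | S=1).

    Assumption: S=0 is the group placed in the numerator. Swap the two
    groups if the opposite convention applies.
    """
    s = np.asarray(s)
    y_hat = np.asarray(y_hat)
    rate_group_0 = y_hat[s == 0].mean()  # positive rate in group S=0
    rate_group_1 = y_hat[s == 1].mean()  # positive rate in group S=1
    return rate_group_0 / rate_group_1


# Toy data (hypothetical): 4 individuals per group.
s = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_hat = np.array([1, 0, 0, 0, 1, 1, 0, 0])

di = disparate_impact(s, y_hat)
print(f"Disparate Impact: {di:.2f}")
```

Here the positive rate is 1/4 in group `S = 0` and 2/4 in group `S = 1`, so the ratio is 0.5. A value of 1 would mean both groups receive positive outcomes at the same rate.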
| Dataset | Adult[1] | INC[2] | TRA[2] | MOB[2] | BAF[3] | EMP[2] | PUC[2] |
|--------|------|-----|-----|-----|-----|-----|-----|
| **Sensitive Attribute (S)** | Sex | Sex | Sex | Age | Age | Disability | Disability |
| **Disparate Impact (DI)** | 0.30 | 0.67 | 0.69 | 0.45 | 0.35 | 0.30 | 0.32 |

```
[1]: Becker, B. and Kohavi, R. (1996). Adult. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5XW20, https://www.kaggle.com/datasets/uciml/adult-census-income.
[2]: Ding, F., Hardt, M., Miller, J., and Schmidt, L. (2021). Retiring adult: New datasets for fair machine learning. In Beygelzimer, A., Dauphin, Y., Liang, P., and Vaughan, J. W., editors, Advances in Neural Information Processing Systems, https://github.com/socialfoundations/folktables.
[3]: Jesus, S., Pombal, J., Alves, D., Cruz, A., Saleiro, P., Ribeiro, R. P., Gama, J., and Bizarro, P. (2022). Turning the tables: Biased, imbalanced, dynamic tabular datasets for ml evaluation. In Advances in Neural Information Processing Systems, https://www.kaggle.com/datasets/sgpjesus/bank-account-fraud-dataset-neurips-2022.
```

## 🎯 Intended Use

We propose a methodology that simulates how stakeholders could try to evade an audit of the Disparate Impact ratio without incurring any liability. The auditee aims to construct a dataset whose distribution is as close as possible to the distribution of the original data, while ensuring that the fairness measure is above the threshold required by regulations. We provide here the results of the different manipulation strategies, which we later detect using distribution tests.

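To illustrate what a distribution test could look like from the auditor's side, here is a hedged sketch using a two-sample Kolmogorov-Smirnov test from SciPy on synthetic one-dimensional data. The choice of the KS test, the synthetic scores, and the crude cutoff-based manipulation are all assumptions for illustration; the specific tests used in the paper are not listed here, and the manipulations in this dataset are designed to stay much closer to the original distribution than this toy example.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Hypothetical 1-D quantity (e.g. a model score) for the full test population.
reference = rng.normal(loc=0.0, scale=1.0, size=5000)

# An honest submission: a uniform random subsample of the reference data.
honest = rng.choice(reference, size=1000, replace=False)

# A crudely manipulated submission: only rows above a cutoff are kept,
# which shifts the submitted distribution away from the reference one.
manipulated = reference[reference > -0.5][:1000]

# Two-sample KS test of each submission against the reference distribution.
honest_test = ks_2samp(honest, reference)
manipulated_test = ks_2samp(manipulated, reference)

print(f"honest:      statistic={honest_test.statistic:.3f}, p-value={honest_test.pvalue:.3g}")
print(f"manipulated: statistic={manipulated_test.statistic:.3f}, p-value={manipulated_test.pvalue:.3g}")
```

A small p-value for the manipulated sample indicates that it is unlikely to have been drawn from the reference distribution. A one-dimensional test only covers a single feature or score; tabular data would typically need a test per feature or a multivariate test.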
- Analyzing auditing frameworks and their vulnerabilities
- Studying distribution manipulation strategies
- Reproducing EIF experiments
- Benchmarking distribution manipulation strategies
- Benchmarking manipulation-proof strategies through statistical tests

## ⚠️ Limitations

- Not intended for production use
- Contains synthetic and manipulated data
- Original classification performance is not optimized

## ⚖️ Ethical Considerations

This work studies how malicious actors could manipulate audit datasets to appear compliant with fairness metrics such as Disparate Impact. Our objective is to expose these vulnerabilities in order to strengthen auditing procedures and regulatory oversight. By analyzing both manipulation strategies and statistical detection methods, we aim to support the development of more robust fairness auditing frameworks.

This dataset should be used for **research and analysis only**.

---

## 🚀 Usage

For a single file:

```python
from huggingface_hub import hf_hub_download
import numpy as np

# Download one array from the dataset repository and load it
path = hf_hub_download(
    repo_id="ValentinLAFARGUE/EIF-Manipulated-distributions",
    filename="ASC_INC/DI.npy",
    repo_type="dataset"
)
arr = np.load(path)
```

For all files:

```python
from huggingface_hub import hf_hub_download, list_repo_files
import numpy as np

repo_id = "ValentinLAFARGUE/EIF-Manipulated-distributions"
files = list_repo_files(repo_id, repo_type="dataset")

# Nested dict: dic_arr_results[folder][array_name] -> NumPy array
dic_arr_results = {}
for file in files:
    if file.endswith('.npy'):
        file_split = file.split('/')
        folder, subfile_name = file_split[0], file_split[1]
        path = hf_hub_download(repo_id=repo_id, filename=file, repo_type="dataset")
        arr = np.load(path)
        # Create the per-folder dict on first use, then store the array
        dic_arr_results.setdefault(folder, {})[subfile_name[:-4]] = arr
        print(f'successfully loaded {file}')
```

## 📚 Citation

```
@misc{lafargue2026exposingillusionfairnessauditing,
  title={Exposing the Illusion of Fairness: Auditing Vulnerabilities to Distributional Manipulation Attacks},
  author={Valentin Lafargue and Adriana Laurindo Monteiro and Emmanuelle Claeys and Laurent Risser and Jean-Michel Loubes},
  year={2026},
  eprint={2507.20708},
  url={https://arxiv.org/abs/2507.20708},
}
```
