---
license: mit
tags:
  - fairness
  - classification
metrics:
  - accuracy
papers:
  - https://arxiv.org/abs/2507.20708
---

Exposing the Illusion of Fairness (EIF): biased models whose results were later fairwashed

πŸ“Œ Overview

This repository contains a collection of neural network models trained on seven tabular datasets for the study:

Exposing the Illusion of Fairness (EIF): Auditing Vulnerabilities to Distributional Manipulation Attacks
https://arxiv.org/abs/2507.20708

Codebase:
https://github.com/ValentinLafargue/Inspection

Results:
https://huggingface.co/datasets/ValentinLAFARGUE/EIF-Manipulated-distributions

Each model corresponds to a specific dataset and is designed to analyze fairness properties rather than maximize predictive performance.

🧠 Model Description

All models are multilayer perceptrons (MLPs) trained on tabular data.

  • Fully connected neural networks
  • Hidden layers: configurable (n_loop, n_nodes)
  • Activation: ReLU (optional)
  • Output: Sigmoid
  • Prediction: $\hat{Y} \in [0,1]$
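As a rough illustration of this architecture, the sketch below builds an MLP in PyTorch with the configurable depth (`n_loop`) and width (`n_nodes`) named above, optional ReLU activations, and a sigmoid output. This is an assumption-laden reconstruction, not the exact class from the paper's codebase:

```python
import torch
import torch.nn as nn


class MLPSketch(nn.Module):
    """Illustrative MLP matching the description above.

    `n_loop` (number of hidden layers) and `n_nodes` (hidden width)
    mirror the configurable hyper-parameters named in this card;
    the authoritative implementation lives in the paper's codebase.
    """

    def __init__(self, n_features: int, n_loop: int = 2,
                 n_nodes: int = 64, relu: bool = True):
        super().__init__()
        layers, width = [], n_features
        for _ in range(n_loop):
            layers.append(nn.Linear(width, n_nodes))
            if relu:  # ReLU is optional per the card
                layers.append(nn.ReLU())
            width = n_nodes
        layers.append(nn.Linear(width, 1))
        layers.append(nn.Sigmoid())  # prediction Y_hat in [0, 1]
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)
```

The sigmoid head keeps predictions in $[0,1]$, so a threshold (e.g. 0.5) converts them to binary decisions for fairness metrics.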

πŸ“Š Datasets, Sensitive Attributes, and Disparate Impact

| Dataset | Adult [1] | INC [2] | TRA [2] | MOB [2] | BAF [3] | EMP [2] | PUC [2] |
|---|---|---|---|---|---|---|---|
| Sensitive Attribute (S) | Sex | Sex | Sex | Age | Age | Disability | Disability |
| Disparate Impact (DI) | 0.30 | 0.67 | 0.69 | 0.45 | 0.35 | 0.30 | 0.32 |
[1]: Becker, B. and Kohavi, R. (1996). Adult. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5XW20. Also available at https://www.kaggle.com/datasets/uciml/adult-census-income.

[2]: Ding, F., Hardt, M., Miller, J., and Schmidt, L. (2021). Retiring adult: New datasets for fair machine learning. In Beygelzimer, A., Dauphin, Y., Liang, P., and Vaughan, J. W., editors, Advances in Neural Information Processing Systems. Code: https://github.com/socialfoundations/folktables.

[3]: Jesus, S., Pombal, J., Alves, D., Cruz, A., Saleiro, P., Ribeiro, R. P., Gama, J., and Bizarro, P. (2022). Turning the tables: Biased, imbalanced, dynamic tabular datasets for ML evaluation. In Advances in Neural Information Processing Systems. Data: https://www.kaggle.com/datasets/sgpjesus/bank-account-fraud-dataset-neurips-2022.

Notes

  • Adult dataset: 5,000 test samples
  • Other datasets: 20,000 test samples
  • Sensitive attributes are used for fairness evaluation
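For reference, the disparate impact values in the table above follow the usual ratio convention. A minimal NumPy sketch of that metric, under the assumption that `s == 0` marks the protected group and `s == 1` the reference group (the paper's codebase holds the exact variant used):

```python
import numpy as np


def disparate_impact(y_pred, s):
    """Ratio of positive-prediction rates: protected group (s == 0)
    over reference group (s == 1). Values near 1 indicate parity;
    the four-fifths rule flags values below 0.8."""
    y_pred = np.asarray(y_pred)
    s = np.asarray(s)
    rate_protected = y_pred[s == 0].mean()
    rate_reference = y_pred[s == 1].mean()
    return rate_protected / rate_reference
```

Applied to the binarized model outputs and a sensitive attribute column, this reproduces the kind of DI values (0.30-0.69) reported above.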

Results and manipulated results

The predictions obtained on the test samples, together with their fairwashed counterparts, are directly available on Hugging Face.

πŸ“ˆ Predictive Performance (Accuracy)

| Dataset | Accuracy |
|---|---|
| Adult Census Income | 84% |
| Folktables Income (INC) | 88% |
| Folktables Mobility (MOB) | 84% |
| Folktables Employment (EMP) | 77% |
| Folktables Travel Time (TRA) | 72% |
| Folktables Public Coverage (PUC) | 73% |
| Bank Account Fraud (BAF) | 98% |

Note: The high accuracy on BAF is largely a consequence of its strong class imbalance.
Accuracy was not the main objective of this study.
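To see why imbalance inflates accuracy, consider the trivial baseline that always predicts the majority class; a hypothetical sketch (the ~2% positive rate below is illustrative, not BAF's actual rate):

```python
import numpy as np


def majority_accuracy(y_true):
    """Accuracy of a constant classifier that always predicts the
    majority class. On heavily imbalanced labels this baseline alone
    is already very high, so accuracy says little about the model."""
    y_true = np.asarray(y_true)
    majority = int(y_true.mean() >= 0.5)
    return float((y_true == majority).mean())
```

With 2% positives, always predicting the negative class already scores 98% accuracy, which is why accuracy is a weak signal on BAF-like data.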

🎯 Intended Use

These models are intended for:

  • Fairness analysis
  • Studying disparate impact and bias
  • Reproducing results from the EIF paper
  • Benchmarking fairness-aware methods

⚠️ Limitations and Non-Intended Use

  • Not designed for production
  • Not optimized for predictive performance
  • Should not be used for real-world decision-making

These models intentionally expose biases in standard ML pipelines.

βš–οΈ Ethical Considerations

This work highlights:

  • The presence of bias in machine learning models
  • The limitations of fairness metrics

Models should be interpreted as analytical tools, not fair systems.

πŸ“¦ Repository Structure

Each dataset corresponds to a subfolder:

EIF-biased-classifier/
β”œβ”€β”€ ASC_ADULT_model/
β”œβ”€β”€ ASC_INC_model/
β”œβ”€β”€ ASC_MOB_model/
β”œβ”€β”€ ASC_EMP_model/
β”œβ”€β”€ ASC_TRA_model/
β”œβ”€β”€ ASC_PUC_model/
└── ASC_BAF_model/

Each folder contains:

  • config.json
  • model.safetensors

πŸš€ Usage

# `Network` is the MLP wrapper defined in the paper's codebase
# (https://github.com/ValentinLafargue/Inspection) and exposes the
# standard `from_pretrained` interface.
model = Network.from_pretrained(
    "ValentinLAFARGUE/EIF-biased-classifier",
    subfolder="ASC_INC_model",
)

πŸ“š Citation

@misc{lafargue2026exposingillusionfairnessauditing,
      title={Exposing the Illusion of Fairness: Auditing Vulnerabilities to Distributional Manipulation Attacks}, 
      author={Valentin Lafargue and Adriana Laurindo Monteiro and Emmanuelle Claeys and Laurent Risser and Jean-Michel Loubes},
      year={2026},
      eprint={2507.20708},
      url={https://arxiv.org/abs/2507.20708}, 
}

πŸ” Additional Notes

  • Models are intentionally simple to isolate fairness behavior
  • Results depend on preprocessing and sampling choices
  • Focus is on reproducibility