Intra-Processing Methods for Debiasing Neural Networks

Yash Savani,Colin White,Naveen Sundar Govindarajulu

In this work, we initiate the study of a new paradigm in debiasing research, intra-processing, which sits between in-processing and post-processing methods. Intra-processing methods are designed specifically to debias large models which have been trained on a generic dataset, and fine-tuned on a more specific task. We show how to repurpose existing in-processing methods for this use-case, and we also propose three baseline algorithms: random perturbation, layerwise optimization, and adversarial debiasing. We evaluate these methods across three popular datasets from the AIF360 toolkit, as well as on the CelebA faces dataset. Our code is available at https://github.com/abacusai/intraprocessing_debiasing.