Regression with Label Differential Privacy

Badih Ghazi, Pritish Kamath, Ravi Kumar, Ethan Leeman, Pasin Manurangsi, Avinash Varadarajan, Chiyuan Zhang

We study the task of training regression models with the guarantee of _label_ differential privacy (DP). Given a global prior distribution over label values, which can itself be obtained privately, we derive a label DP randomization mechanism that is optimal under a given regression loss function. We prove that the optimal mechanism takes the form of a "randomized response on bins", and propose an efficient algorithm for finding the optimal bin values. We carry out a thorough experimental evaluation on several datasets, demonstrating the efficacy of our algorithm.
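
To make the phrase "randomized response on bins" concrete, the following is a minimal sketch of one common reading of it: each label is first mapped to one of a small set of bin values, and standard k-ary randomized response is then applied over those bin values. The function name `rr_on_bins`, the nearest-value mapping, and the hand-picked bin values are assumptions made for illustration only. The abstract does not specify them, and the algorithm for choosing the optimal bin values is not reproduced here.

```python
import math
import random


def rr_on_bins(label, bin_values, epsilon, rng=random):
    """Privatize one label via randomized response over a set of bin values.

    Hypothetical helper for illustration. The label is mapped to its nearest
    bin value (an assumed mapping), which is reported with probability
    e^eps / (e^eps + k - 1); otherwise one of the other k - 1 bin values is
    reported uniformly at random.
    """
    k = len(bin_values)
    if k == 0:
        raise ValueError("bin_values must be non-empty")
    # Index of the bin value closest to the true label (assumed mapping).
    true_idx = min(range(k), key=lambda i: abs(bin_values[i] - label))
    if k == 1:
        # With a single bin the output carries no information about the label.
        return bin_values[0]
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_keep:
        return bin_values[true_idx]
    # Otherwise pick uniformly among the remaining bin values.
    others = [i for i in range(k) if i != true_idx]
    return bin_values[rng.choice(others)]


# Usage: privatize a few regression labels with epsilon = ln(3) and 3 bins,
# so the mapped bin value is kept with probability 3 / (3 + 2) = 0.6.
bins = [0.0, 2.5, 5.0]  # hand-picked for illustration, not optimized
rng = random.Random(0)
noisy = [rr_on_bins(y, bins, math.log(3), rng) for y in [0.4, 3.2, 4.9]]
```

For any two labels and any output bin value, the ratio of output probabilities is at most e^ε, which is the label DP guarantee of randomized response. A regression model would then be trained on the features paired with the randomized labels.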