Black-box Certification and Learning under Adversarial Perturbations

Hassan Ashtiani, Vinayak Pathak, Ruth Urner

We formally study the problem of classification under adversarial perturbations, both from the perspective of a learner and from that of a third party who aims to certify the robustness of a given black-box classifier. We analyze a PAC-type framework of semi-supervised learning and identify possibility and impossibility results for proper learning of VC-classes in this setting. We further introduce a new setting of black-box certification under a limited query budget and analyze it for various classes of predictors and perturbations. We also consider the viewpoint of a black-box adversary that aims to find adversarial examples, showing that the existence of an adversary with polynomial query complexity can imply the existence of a sample-efficient robust learner.