Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins

Spencer Frei, Yuan Cao, Quanquan Gu

We analyze the properties of gradient descent on convex surrogates for the zero-one loss for the agnostic learning of halfspaces. We show that when a quantity we refer to as the \textit{soft margin} is well-behaved (a condition satisfied by log-concave isotropic distributions, among others), minimizers of convex surrogates for the zero-one loss are approximate minimizers for the zero-one loss itself. As standard convex optimization arguments lead to efficient guarantees for minimizing convex surrogates of the zero-one loss, our methods allow for the first positive guarantees for the classification error of halfspaces learned by gradient descent using the binary cross-entropy or hinge loss in the presence of agnostic label noise.
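As a rough illustration of the setting described above, the following sketch runs plain gradient descent on the binary cross-entropy (logistic) surrogate for a halfspace and compares the resulting zero-one error with that of a reference halfspace. Everything specific here is an assumption made for illustration and is not taken from the paper: the standard Gaussian features (one example of a log-concave isotropic distribution), the random label flips (only one simple instance of label noise, far weaker than the fully agnostic setting), the step size, the iteration count, and the hypothetical helper names `logistic_loss_grad`, `zero_one_error`, and `run_sketch`.

```python
import numpy as np


def logistic_loss_grad(w, X, y):
    """Gradient of the mean logistic loss log(1 + exp(-y * <w, x>))."""
    margins = y * (X @ w)
    # sigmoid(-m) = 1 / (1 + exp(m)), written via tanh for numerical stability.
    s = 0.5 * (1.0 - np.tanh(0.5 * margins))
    return -(X * (y * s)[:, None]).mean(axis=0)


def zero_one_error(w, X, y):
    """Fraction of points whose predicted sign disagrees with the label."""
    return float(np.mean(np.sign(X @ w) != y))


def run_sketch(n=2000, d=5, flip_prob=0.1, steps=500, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)

    # Assumption: standard Gaussian features, which are log-concave and isotropic.
    X = rng.standard_normal((n, d))

    # Assumption: labels come from a reference halfspace v, then a random
    # fraction is flipped. Agnostic noise allows arbitrary labels; this is
    # only a convenient special case.
    v = np.zeros(d)
    v[0] = 1.0
    y_clean = np.where(X @ v >= 0, 1.0, -1.0)
    flips = rng.random(n) < flip_prob
    y = np.where(flips, -y_clean, y_clean)

    # Plain gradient descent on the convex surrogate, starting from zero.
    w = np.zeros(d)
    for _ in range(steps):
        w -= lr * logistic_loss_grad(w, X, y)

    return zero_one_error(w, X, y), zero_one_error(v, X, y)


if __name__ == "__main__":
    gd_err, ref_err = run_sketch()
    print(f"zero-one error of GD iterate:          {gd_err:.3f}")
    print(f"zero-one error of reference halfspace: {ref_err:.3f}")
```

Running the script prints the two training zero-one errors side by side. The sketch only demonstrates the mechanics of minimizing a surrogate and then scoring with the zero-one loss; it does not verify or reproduce any bound from the paper.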