Optimal Convergence Rates for Agnostic Nystru00f6m Kernel Learning

Jian Li,u00a0Yong Liu,u00a0Weiping Wang

Nystru00f6m low-rank approximation has shown great potential in processing large-scale kernel matrix and neural networks. However, there lacks a unified analysis for Nystru00f6m approximation, and the asymptotical minimax optimality for Nystru00f6m methods usually require a strict condition, assuming that the target regression lies exactly in the hypothesis space. In this paper, to tackle these problems, we provide a refined generalization analysis for Nystru00f6m approximation in the agnostic setting, where the target regression may be out of the hypothesis space. Specifically, we show Nystru00f6m approximation can still achieve the capacity-dependent optimal rates in the agnostic setting. To this end, we first prove the capacity-dependent optimal guarantees of Nystru00f6m approximation with the standard uniform sampling, which covers both loss functions and applies to some agnostic settings. Then, using data-dependent sampling, for example, leverage scores sampling, we derive the capacity-dependent optimal rates that apply to the whole range of the agnostic setting. To our best knowledge, the capacity-dependent optimality for the whole range of the agnostic setting is first achieved and novel in Nystru00f6m approximation.