Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nyström method
Michał Dereziński, Rajiv Khanna, Michael W. Mahoney
The Column Subset Selection Problem (CSSP) and the Nyström method are among the leading tools for constructing small low-rank approximations of large datasets in machine learning and scientific computing. A fundamental question in this area is: how well can a data subset of size k compete with the best rank k approximation? We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees which go beyond the standard worst-case analysis. Our approach leads to significantly better bounds for datasets with known rates of singular value decay, e.g., polynomial or exponential decay. Our analysis also reveals an intriguing phenomenon: the approximation factor as a function of k may exhibit multiple peaks and valleys, which we call a multiple-descent curve. A lower bound we establish shows that this behavior is not an artifact of our analysis, but rather an inherent property of the CSSP and Nyström tasks. Finally, using the example of a radial basis function (RBF) kernel, we show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
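To make the quantity under study concrete, the following minimal sketch estimates the CSSP/Nyström approximation factor ||A - P_S A||_F^2 / ||A - A_k||_F^2 for an RBF kernel matrix, where P_S projects onto the span of a size-k column subset S and A_k is the best rank k approximation. It is not the authors' selection algorithm: it uses synthetic Gaussian data, a uniformly random column subset as a simple baseline, and illustrative names (sigma, subset_error) that are assumptions of this example.

```python
# Minimal sketch (assumed setup, not the paper's method): estimate the CSSP
# approximation factor ||K - P_S K||_F^2 / ||K - K_k||_F^2 for an RBF kernel
# matrix, using uniformly random column subsets S of size k as a baseline.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data and RBF kernel K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).
n, d, sigma = 300, 5, 1.0
X = rng.standard_normal((n, d))
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq_dists / (2 * sigma**2))

# Best rank-k approximation errors via the SVD (Eckart-Young):
# tail[k] = sum of the squared singular values beyond the top k = ||K - K_k||_F^2.
s = np.linalg.svd(K, compute_uv=False)
tail = np.cumsum((s**2)[::-1])[::-1]

def subset_error(K, cols):
    """||K - P_S K||_F^2, projecting K onto the span of the selected columns."""
    C = K[:, cols]
    P = C @ np.linalg.pinv(C)
    return np.linalg.norm(K - P @ K, "fro") ** 2

for k in [5, 10, 20, 40]:
    errs = [subset_error(K, rng.choice(n, size=k, replace=False)) for _ in range(20)]
    print(f"k={k:3d}  approximation factor ~ {np.mean(errs) / tail[k]:.3f}")
```

Plotting this factor over a fine grid of k (and repeating for several values of the RBF parameter sigma) is one simple way to look for the peaks and valleys described in the abstract, though the paper's guarantees concern more careful column selection than uniform sampling.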


