Exact expressions for double descent and implicit regularization via surrogate random design

Michal Derezinski, Feynman T. Liang, Michael W. Mahoney

Double descent refers to the phase transition exhibited by the generalization error of unregularized learning models when varying the ratio between the number of parameters and the number of training samples. The recent success of highly over-parameterized machine learning models such as deep neural networks has motivated a theoretical analysis of the double descent phenomenon in classical models such as linear regression, which can also generalize well in the over-parameterized regime. We provide the first exact non-asymptotic expressions for double descent of the minimum norm linear estimator. Our approach involves constructing a special determinantal point process, which we call surrogate random design, to replace the standard i.i.d. design of the training sample. This surrogate design admits exact expressions for the mean squared error of the estimator while preserving the key properties of the standard design. We also establish an exact implicit regularization result for over-parameterized training samples. In particular, we show that, for the surrogate design, the implicit bias of the unregularized minimum norm estimator precisely corresponds to solving a ridge-regularized least squares problem on the population distribution. In our analysis we introduce a new mathematical tool of independent interest: the class of random matrices for which the determinant commutes with expectation.
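
The following is a minimal numerical sketch of the two phenomena the abstract describes, not an implementation of the paper's method: it uses a plain i.i.d. Gaussian design rather than the surrogate determinantal design, so it reproduces the qualitative behavior rather than the paper's exact expressions. All sizes, the noise level, the isotropic covariance, and the ridge parameter choice (the effective-dimension condition with Sigma = I) are illustrative assumptions.

```python
# Sketch under assumptions: i.i.d. Gaussian design (not the surrogate design),
# isotropic population covariance, illustrative dimensions and noise level.
import numpy as np

rng = np.random.default_rng(0)
d, noise = 50, 0.5                          # parameters, label noise std
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)            # unit-norm ground truth

def min_norm_estimator(n):
    """Minimum norm least squares on an n x d i.i.d. Gaussian sample."""
    X = rng.normal(size=(n, d))
    y = X @ w_star + noise * rng.normal(size=n)
    return np.linalg.pinv(X) @ y            # min norm solution in both regimes

# --- Double descent: the MSE peaks at the interpolation threshold n = d ---
for n in [10, 25, 40, 48, 50, 52, 60, 100, 200]:
    mse = np.mean([np.sum((min_norm_estimator(n) - w_star) ** 2)
                   for _ in range(200)])
    print(f"n = {n:3d}  (d/n = {d/n:5.2f})  MSE = {mse:8.3f}")

# --- Implicit regularization in the over-parameterized regime (n < d) ---
# For this isotropic design, E[w_hat] = (n/d) * w_star, which coincides with
# the population ridge solution w_star / (1 + lam) when lam is chosen so that
# the effective dimension d / (1 + lam) equals n, i.e. lam = d/n - 1.
n = 20
avg = np.mean([min_norm_estimator(n) for _ in range(2000)], axis=0)
lam = d / n - 1
print("||avg - ridge||:", np.linalg.norm(avg - w_star / (1 + lam)))
```

The first loop traces the characteristic double descent shape: the error of the minimum norm estimator blows up near n = d and descends on both sides of the threshold. The second part illustrates the implicit regularization claim in this isotropic special case: the averaged unregularized estimator approaches a ridge-regularized population solution, whereas the paper's exact correspondence holds for the surrogate design.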