Sharp Representation Theorems for ReLU Networks with Precise Dependence on Depth

Guy Bresler, Dheeraj Nagaraj

This constitutes a fine-grained characterization of the representation power of feedforward networks of arbitrary depth D and number of neurons N, in contrast to existing representation results, which either require D to grow quickly with N or assume that the function being represented is highly smooth. In the latter case, similar rates can be obtained with a single nonlinear layer. Our results confirm the prevailing hypothesis that deeper networks are better at representing less smooth functions. Indeed, the main technical novelty is to fully exploit the fact that deep networks can produce highly oscillatory functions with few activation functions.
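The last point can be illustrated with a minimal sketch. It uses the classical tent-map composition, a standard textbook example that is not claimed to be the construction used in this paper. The function names (`relu`, `tent`, `deep_sawtooth`, `count_monotone_pieces`) and the grid size are assumptions chosen for illustration. Composing a two-ReLU tent map `depth` times yields a sawtooth on [0, 1] with 2^depth linear pieces while using only 2 * depth ReLU units.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def tent(x):
    # Tent map on [0, 1] from two ReLUs: 2x for x <= 1/2, 2 - 2x for x >= 1/2.
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)


def deep_sawtooth(x, depth):
    # Compose the tent map `depth` times: 2 ReLUs per layer, 2 * depth in total.
    for _ in range(depth):
        x = tent(x)
    return x


def count_monotone_pieces(y):
    # Count sign changes of the discrete slope along the grid, plus one.
    slopes = np.sign(np.diff(y))
    slopes = slopes[slopes != 0]
    return 1 + int(np.sum(slopes[1:] != slopes[:-1]))


# Dyadic grid, so every breakpoint of the sawtooth falls exactly on a grid point.
x = np.linspace(0.0, 1.0, 2**12 + 1)
for depth in range(1, 6):
    pieces = count_monotone_pieces(deep_sawtooth(x, depth))
    print(f"depth={depth}  relus={2 * depth}  linear pieces={pieces}")
```

For comparison, a network with a single hidden layer of N ReLU units and scalar input is piecewise linear with at most N + 1 pieces, so matching the depth-5 sawtooth (32 pieces from 10 ReLUs) would take at least 31 units in one layer.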