Neural Path Features and Neural Path Kernel : Understanding the role of gates in deep learning

Chandrashekar Lakshminarayanan, Amit Vikram Singh

In this paper, we analytically characterise the role of gates and active sub-networks in deep learning. To this end, we encode the on/off state of the gates for a given input in a novel 'neural path feature' (NPF), and the weights of the DNN are encoded in a novel 'neural path value' (NPV). Further, we show that the output of network is indeed the inner product of NPF and NPV. The main result of the paper shows that the 'neural path kernel' associated with the NPF is a fundamental quantity that characterises the information stored in the gates of a DNN. We show via experiments (on MNIST and CIFAR-10) that in standard DNNs with ReLU activations NPFs are learnt during training and such learning is key for generalisation. Furthermore, NPFs and NPVs can be learnt in two separate networks and such learning also generalises well in experiments. In our experiments, we observe that almost all the information learnt by a DNN with ReLU activations is stored in the gates - a novel observation that underscores the need to investigate the role of the gates in DNNs.