Importance of initialization of weight matrices in deep learning neural networks

Abstract

The success of deep neural networks depends critically on how their weight matrices are initialized. This work reports improved learning in a six-layer deep neural network initialized with orthogonal weight matrices, compared to other commonly used initialization schemes. An analysis of the eigenvalue spectra of the optimized solutions suggests that the space of orthogonal weight matrices lies close to the manifold of learned states.
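
The abstract does not specify how the orthogonal matrices are constructed; a standard approach is to take the Q factor of the QR decomposition of a random Gaussian matrix. The following is a minimal sketch of that construction for a fully connected network; the function name and layer sizes are hypothetical and not taken from the paper.

    import numpy as np

    def orthogonal_init(fan_in, fan_out, gain=1.0, seed=None):
        """Return a (fan_in, fan_out) weight matrix whose rows or
        columns (whichever is shorter) are orthonormal, obtained from
        the QR decomposition of a random Gaussian matrix."""
        rng = np.random.default_rng(seed)
        a = rng.standard_normal((max(fan_in, fan_out), min(fan_in, fan_out)))
        q, r = np.linalg.qr(a)
        # Fix column signs so the result is uniform over orthogonal matrices.
        q *= np.sign(np.diag(r))
        if q.shape != (fan_in, fan_out):
            q = q.T
        return gain * q

    # Hypothetical six-layer network: one weight matrix per layer pair.
    layer_sizes = [784, 512, 512, 512, 512, 10]
    weights = [orthogonal_init(m, n) for m, n in zip(layer_sizes, layer_sizes[1:])]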