Simple techniques for improving deep neural network outcomes on commodity hardware

Abstract

We benchmark improvements in the performance of deep neural networks (DNNs) on the MNIST data set obtained from two simple modifications to the algorithm that add little computational overhead. The first is GPU parallelization on a commodity graphics card; the second is initializing the DNN with random orthogonal weight matrices prior to optimization. Eigenspectrum analysis of the weight matrices reveals that the initially orthogonal matrices remain nearly orthogonal after training. The probability distributions from which these orthogonal matrices are drawn are also shown to significantly affect the performance of these deep neural networks.
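
As an illustration of the orthogonal initialization described above, the following is a minimal sketch, not the implementation used in this work. It assumes one particular choice of distribution: the QR decomposition of a Gaussian random matrix with a sign correction, which yields matrices distributed uniformly over the orthogonal group. The function name `random_orthogonal`, the layer width of 64, and the seed are hypothetical choices made for the example. The sketch also checks the two properties an eigenspectrum analysis would look for: the product of the matrix transpose with the matrix is close to the identity, and all eigenvalues lie on the unit circle.

```python
import numpy as np


def random_orthogonal(n, rng):
    """Draw an n x n orthogonal matrix (hypothetical helper, one possible distribution)."""
    # Start from a matrix with i.i.d. standard normal entries.
    a = rng.standard_normal((n, n))
    # QR decomposition: q has orthonormal columns, r is upper triangular.
    q, r = np.linalg.qr(a)
    # Scale each column of q by the sign of the matching diagonal entry of r,
    # which removes the sign ambiguity of the decomposition.
    d = np.sign(np.diag(r))
    d[d == 0] = 1.0
    return q * d


rng = np.random.default_rng(0)  # assumed seed, for reproducibility only
n = 64                          # assumed layer width
W = random_orthogonal(n, rng)

# Largest deviation of W^T W from the identity matrix.
orth_error = np.max(np.abs(W.T @ W - np.eye(n)))

# Moduli of the eigenvalues; for an orthogonal matrix these all equal 1.
eig_moduli = np.abs(np.linalg.eigvals(W))

print("max |W^T W - I| =", orth_error)
print("eigenvalue moduli range:", eig_moduli.min(), eig_moduli.max())
```

The same two diagnostics can be applied to a weight matrix after training to measure how far it has drifted from orthogonality.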