SRC21. Quantization for Energy Efficient Convolutional Neural Networks
Student: Joao Vitor Mascarenhas (Federal University of Ouro Preto)
Supervisor: Daniela Ushizima (Lawrence Berkeley National Laboratory)
Abstract: A traditional Convolutional Neural Network (CNN) is parameterized by floating point weights and biases and takes floating point data as input. In many cases, the floating point representation of these parameters and inputs provides more precision than necessary. Using a more compact representation of the parameters and inputs allows CNNs to be deployed on energy efficient architectures that operate on only a few bits and have a much smaller memory footprint. This work focuses on data reduction and quantization schemes that can be applied to a trained CNN for classifying scientific simulation data. We show that each neuron and synapse can be encoded with only one byte while maintaining accuracy above 98%.
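The one-byte encoding described above can be illustrated with a simple affine (min-max) post-training quantization; this is a minimal sketch of the general technique, and the exact scheme used in the work may differ.

```python
import numpy as np

def quantize_uint8(w):
    """Linearly map float values onto one byte (uint8); return the
    quantized array plus the scale and offset needed to dequantize."""
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / 255.0 if w_max > w_min else 1.0
    q = np.round((w - w_min) / scale).astype(np.uint8)
    return q, scale, w_min

def dequantize(q, scale, w_min):
    """Recover approximate float values from the byte encoding."""
    return q.astype(np.float32) * scale + w_min

# Example: quantize a random weight tensor and bound the error.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
q, scale, w_min = quantize_uint8(w)
w_hat = dequantize(q, scale, w_min)
# Each weight now occupies one byte; rounding error is at most one
# quantization step (in fact at most half a step plus float error).
print(q.dtype, float(np.abs(w - w_hat).max()) <= scale)
```

Applying the same idea to activations and inputs reduces memory traffic by 4x relative to float32, which is where much of the energy saving on low-power hardware comes from.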
Two-page extended abstract: pdf