A fully connected layer elimination for a binarized convolutional neural network on an FPGA

Bibliographic Details
Published in: International Conference on Field-programmable Logic and Applications, pp. 1 - 4
Main Authors: Nakahara, Hiroki; Fujii, Tomoya; Sato, Shimpei
Format: Conference Proceeding
Language: English, Japanese
Published: Ghent University, 01.09.2017
ISSN: 1946-1488
DOI: 10.23919/FPL.2017.8056771

Summary: A pre-trained convolutional deep neural network (CNN) is widely used in embedded systems, which require high power and area efficiency. In that setting, the CPU is too slow, the embedded GPU dissipates too much power, and the ASIC cannot keep up with the rapid progress of CNN variations. This paper uses a binarized CNN, which restricts both the inputs and the weights to binary values. Since each multiplier is replaced by an XNOR circuit, a high-performance MAC circuit can be realized from many XNOR circuits. In this paper, we eliminate the internal FC layers excluding the last one and insert a binarized average pooling layer, which can be realized by a majority circuit over binarized (1/0) values. In that case, since the weight memory is replaced by a 1's counter, we can realize a CNN that is more compact and faster than conventional ones. We implemented the VGG-11 benchmark CNN for the CIFAR-10 image classification task on the Xilinx Inc. Zedboard. Compared with conventional binarized implementations on an FPGA, the classification accuracy was almost the same, while performance per power was 5.1 times better, performance per area was 8.0 times better, and performance per memory was 8.2 times better.
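
The two hardware tricks the summary describes can be sketched in software: a binarized MAC reduces to XNOR plus a population count, and binarized (1/0) average pooling reduces to a majority vote, i.e. a 1's counter compared against half the window size. The Python sketch below is an illustrative assumption under the usual bit encoding (1 -> +1, 0 -> -1), not the authors' FPGA design; the names xnor_popcount_mac and binarized_average_pool are hypothetical.

def xnor_popcount_mac(x_bits, w_bits, nbits):
    # Binarized dot product over {-1, +1}: each product is +1 exactly when the
    # stored bits agree, so multiplication becomes XNOR and the accumulation
    # becomes a population count: sum = 2 * popcount(xnor) - nbits.
    xnor = ~(x_bits ^ w_bits) & ((1 << nbits) - 1)
    return 2 * bin(xnor).count("1") - nbits

def binarized_average_pool(window_bits):
    # Binarized (1/0) average pooling: averaging then thresholding at 0.5 is a
    # majority vote, i.e. a 1's counter compared against half the window size.
    return 1 if 2 * sum(window_bits) > len(window_bits) else 0

# Example with 8 binarized inputs and weights packed as bits (1 -> +1, 0 -> -1).
x = 0b10110010
w = 0b10010110
print(xnor_popcount_mac(x, w, 8))            # prints 4, the signed dot product
print(binarized_average_pool([1, 0, 1, 1]))  # prints 1, the window's majority

This is why the paper can drop the internal FC layers' weight memories: the majority circuit needs only a counter and a comparator, no stored weights.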