Fast Algorithms for Convolutional Neural Networks

Bibliographic Details
Published in: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4013-4021
Main Authors: Lavin, Andrew; Gray, Scott
Format: Conference Proceeding
Language: English
Published: IEEE, 01.06.2016
ISSN: 1063-6919
DOI: 10.1109/CVPR.2016.435

Summary: Deep convolutional neural networks take GPU-days of computation to train on large data sets. Pedestrian detection for self-driving cars requires very low latency. Image recognition for mobile phones is constrained by limited processing resources. The success of convolutional neural networks in these situations is limited by how fast we can compute them. Conventional FFT-based convolution is fast for large filters, but state-of-the-art convolutional neural networks use small, 3×3 filters. We introduce a new class of fast algorithms for convolutional neural networks using Winograd's minimal filtering algorithms. The algorithms compute minimal-complexity convolution over small tiles, which makes them fast with small filters and small batch sizes. We benchmark a GPU implementation of our algorithm with the VGG network and show state-of-the-art throughput at batch sizes from 1 to 64.
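
The core idea, computing each small output tile with a minimal number of multiplications rather than by direct 3×3 correlation, can be sketched for the smallest case, F(2×2, 3×3), in a few lines of NumPy. The transform matrices BT, G, and AT below are the standard ones for this case; the function and variable names are illustrative and not taken from the authors' GPU implementation.

    import numpy as np

    # Winograd F(2x2, 3x3): a 2x2 output tile from a 4x4 input tile and a 3x3 filter,
    # using 4*4 = 16 elementwise multiplications instead of 2*2*9 = 36.
    BT = np.array([[1,  0, -1,  0],
                   [0,  1,  1,  0],
                   [0, -1,  1,  0],
                   [0,  1,  0, -1]], dtype=np.float64)   # input transform
    G  = np.array([[1.0,  0.0, 0.0],
                   [0.5,  0.5, 0.5],
                   [0.5, -0.5, 0.5],
                   [0.0,  0.0, 1.0]])                    # filter transform
    AT = np.array([[1, 1,  1,  0],
                   [0, 1, -1, -1]], dtype=np.float64)    # inverse (output) transform

    def winograd_2x2_3x3(d, g):
        # d: 4x4 input tile, g: 3x3 filter; returns the 2x2 correlation output.
        U = G @ g @ G.T    # transformed filter (precomputed once per filter in practice)
        V = BT @ d @ BT.T  # transformed input tile
        return AT @ (U * V) @ AT.T

    # Sanity check against direct correlation on a random tile.
    rng = np.random.default_rng(0)
    d, g = rng.standard_normal((4, 4)), rng.standard_normal((3, 3))
    direct = np.array([[(d[i:i+3, j:j+3] * g).sum() for j in range(2)] for i in range(2)])
    assert np.allclose(winograd_2x2_3x3(d, g), direct)

In a full convolutional layer the filter transforms are reused across all tiles, and the elementwise products summed over input channels can be batched into matrix multiplications, which is how the paper's GPU implementation obtains its throughput.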