Deep Learning in Reconfigurable Hardware: A Survey
Published in | 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 95 - 98 |
---|---|
Main Authors | |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.05.2019 |
DOI | 10.1109/IPDPSW.2019.00026 |
Summary: Deep Learning has been applied successfully to solve complex problems that involve the analysis of large data sets, and these good results are directly related to the size and complexity of the networks and their training algorithms. However, these structures are considerably resource-consuming and demand extra effort to be used on embedded systems. Researchers have chosen different alternatives for executing deep learning algorithms, such as servers, clusters, GPUs and FPGAs, and specific hardware structures have been designed to address these problems on platforms such as FPGAs and even ASICs. Although there are surveys on this subject, they do not present clear criteria for real implementations, detailed explanations of design techniques, or the metrics applied. Taking a different approach, this work analyzes reconfigurable-hardware-based structures designed to optimize deep learning algorithms. It covers hardware accelerators implemented and tested on commercially available development boards, avoiding simulation-based results. The proposed analysis shows that considerable effort is directed at this subject, but the reported results are still far from what is expected of a hardware structure for deep learning.