Reinforcement Learning Accelerator using Q-Learning Algorithm with Optimized Bit Precision

Bibliographic Details
Published in: 2022 8th International Conference on Wireless and Telematics (ICWT), pp. 1-5
Main Authors: Sutisna, Nana; Ilmy, Andi M. R.; Setiawan, Handi Nugroho; Syafalni, Infall; Mulyawan, Rahmat; Ahmadi, Nur; Adiono, Trio
Format: Conference Proceeding
Language: English
Published: IEEE, 21.07.2022
DOI: 10.1109/ICWT55831.2022.9935371

Summary: This paper presents a Reinforcement Learning (RL) accelerator based on the Q-learning algorithm with optimized bit precision. We evaluate how the bit width of the data path affects the accuracy of the Q-values. The accelerator implements Q-learning using four blocks: Q-Value memories, a Q-Updater, a Policy Generator, and an Environment block. We also present the corresponding architecture and implement the design on an FPGA. Experimental results show that the data path can be reduced from 32 bits to 16 bits without sacrificing accuracy: accuracy is maintained at around 88% with a 16-bit data path that has a 10-bit fraction. Moreover, compared with the 32-bit implementation, the 16-bit RL accelerator reduces LUT and FF usage by around 40% and 14%, respectively. The optimized accelerator can therefore be useful for low-complexity or resource-constrained systems, such as robot automation for smart navigation and smart mapping.
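
To make the 16-bit data path with a 10-bit fraction concrete, the following is a minimal software sketch of a fixed-point Q-value update. It is an illustration only, not the authors' hardware design: the signed two's-complement format, the saturation behaviour, the truncating multiply, and all function names (`to_fixed`, `fixed_mul`, `q_update`) are assumptions made for this example. The update rule is the standard Q-learning rule Q(s,a) <- Q(s,a) + alpha * (r + gamma * max Q(s',a') - Q(s,a)).

```python
# Assumed format: signed 16-bit word with 10 fractional bits.
# The word and fraction widths come from the summary; signedness and
# saturation are assumptions made for this sketch.
WORD_BITS = 16
FRAC_BITS = 10
SCALE = 1 << FRAC_BITS
Q_MIN = -(1 << (WORD_BITS - 1))
Q_MAX = (1 << (WORD_BITS - 1)) - 1


def saturate(v: int) -> int:
    """Clamp an integer to the representable 16-bit range."""
    return max(Q_MIN, min(Q_MAX, v))


def to_fixed(x: float) -> int:
    """Quantize a real value to the assumed fixed-point format."""
    return saturate(round(x * SCALE))


def to_float(q: int) -> float:
    """Convert a fixed-point integer back to a real value."""
    return q / SCALE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two fixed-point values and rescale by dropping low bits."""
    return saturate((a * b) >> FRAC_BITS)


def q_update(q_sa: int, q_next_max: int, reward: int,
             alpha: int, gamma: int) -> int:
    """One Q-learning update; all arguments are fixed-point integers."""
    target = saturate(reward + fixed_mul(gamma, q_next_max))
    td_error = saturate(target - q_sa)
    return saturate(q_sa + fixed_mul(alpha, td_error))


# Hypothetical example values: alpha = 0.5, gamma = 0.9, reward = 1.0.
new_q = q_update(
    q_sa=to_fixed(0.0),
    q_next_max=to_fixed(1.0),
    reward=to_fixed(1.0),
    alpha=to_fixed(0.5),
    gamma=to_fixed(0.9),
)
print(to_float(new_q))  # close to the floating-point result 0.95
```

With these assumptions the representable range is roughly -32 to +32 with a resolution of 1/1024, so rewards and Q-values would need to stay inside that range for the quantization error to remain small.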