Bayesian asymmetric quantized neural networks

Bibliographic Details
Published in: Pattern Recognition, Vol. 139, p. 109463
Main Authors: Chien, Jen-Tzung; Chang, Su-Ting
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.07.2023
ISSN: 0031-3203
EISSN: 1873-5142
DOI: 10.1016/j.patcog.2023.109463


More Information
Summary:
• An M-ary quantized neural network is proposed, with an adjustable M that balances end performance against implementation cost.
• Bayesian quantization is implemented to ensure robustness in quantization learning.
• A new multi-spike-and-slab prior is proposed to represent asymmetric parameters in Bayesian quantization.
• Joint training of neural networks for compression and classification is developed.
• State-of-the-art performance is achieved for image recognition across various competitive tasks.

This paper develops robust model compression for neural networks via parameter quantization. Traditionally, quantized neural networks (QNNs) were constructed with deterministic binary or ternary weights. This paper generalizes the QNN in two directions. First, an M-ary QNN is developed to adjust the balance between memory storage and model capacity. The representation values and the quantization partitions in M-ary quantization are mutually estimated to enhance the resolution of gradients in neural network training. A flexible quantization with asymmetric partitions is formulated. Second, variational inference is incorporated to implement the Bayesian asymmetric QNN. The uncertainty of the weights is faithfully represented to enhance the robustness of the trained model in the presence of heterogeneous data. Importantly, a multiple spike-and-slab prior is proposed to represent the quantization levels in Bayesian asymmetric learning. M-ary quantization is then optimized by maximizing the evidence lower bound of the classification network. An adaptive parameter space is built to implement Bayesian quantization and neural representation. Experiments on various image recognition tasks show that the M-ary QNN achieves performance comparable to the full-precision neural network (FPNN), while the memory cost and test time are significantly reduced relative to the FPNN. The merit of the Bayesian M-ary QNN with the multiple spike-and-slab prior is also investigated.
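
To make the mutual estimation of representation values and asymmetric partitions concrete, below is a minimal Python sketch of a Lloyd-style alternation over a weight tensor. It illustrates the general principle only, not the authors' procedure (which couples quantization with gradient-based network training); the function name m_ary_quantize and all parameters are hypothetical.

import numpy as np

def m_ary_quantize(weights, m=4, n_iters=10):
    # Illustrative M-ary quantization via Lloyd-style alternation:
    # (1) partition step: assign each weight to its nearest level;
    # (2) representation step: re-estimate each level as the mean of
    #     its assigned weights. The M levels and the (asymmetric)
    #     partition boundaries are thus mutually estimated.
    w = weights.ravel()
    # Initialize the M representation values at evenly spaced quantiles,
    # which already yields asymmetric levels for skewed weight histograms.
    levels = np.quantile(w, np.linspace(0.0, 1.0, m))
    for _ in range(n_iters):
        # Partition step: nearest-level assignment for every weight.
        idx = np.argmin(np.abs(w[:, None] - levels[None, :]), axis=1)
        # Representation step: each level becomes the mean of its cell.
        for k in range(m):
            if np.any(idx == k):
                levels[k] = w[idx == k].mean()
    # Partition boundaries sit midway between adjacent levels, so the
    # quantization cells need not be symmetric around zero.
    boundaries = (levels[:-1] + levels[1:]) / 2.0
    return levels[idx].reshape(weights.shape), levels, boundaries

# Usage: quantize a random weight matrix to M = 4 levels.
rng = np.random.default_rng(0)
w_q, levels, bounds = m_ary_quantize(rng.normal(size=(8, 8)), m=4)

Setting m = 2 or m = 3 recovers the binary and ternary special cases mentioned in the abstract, which is how the adjustable M trades implementation cost against model capacity.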
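Likewise, the following LaTeX sketch shows one plausible form for a multiple spike-and-slab prior with spikes at the M representation values c_1, ..., c_M, followed by the standard evidence lower bound (ELBO) maximized in variational training; the mixture form with weights pi_k and spike probability lambda is an assumption for illustration, and the paper's exact parameterization may differ.

% Assumed form: a mixture of M spike-and-slab components, each centered
% at a representation value c_k, with mixing weights pi_k and spike
% probability lambda (hypothetical parameterization).
p(w) = \sum_{k=1}^{M} \pi_k \Big[ \lambda\, \delta(w - c_k)
     + (1 - \lambda)\, \mathcal{N}(w \mid c_k, \sigma^2) \Big],
\qquad \sum_{k=1}^{M} \pi_k = 1

% Standard evidence lower bound (ELBO) for variational training of a
% classification network with weights W and data D:
\mathcal{L}(q) = \mathbb{E}_{q(\mathbf{W})}\big[ \log p(\mathcal{D} \mid \mathbf{W}) \big]
     - \mathrm{KL}\big( q(\mathbf{W}) \,\|\, p(\mathbf{W}) \big)

The spikes concentrate posterior mass on the M quantization levels while the slabs retain weight uncertainty, which is consistent with the abstract's claim that the prior supports both quantization and robustness to heterogeneous data.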