KBStyle: Fast Style Transfer Using a 200 KB Network With Symmetric Knowledge Distillation
| Published in | IEEE transactions on image processing Vol. 33; pp. 82 - 94 |
|---|---|
| Main Authors | , , , , |
| Format | Journal Article |
| Language | English |
| Published | United States: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2024 |
| Subjects | |
| ISSN | 1057-7149, 1941-0042 |
| DOI | 10.1109/TIP.2023.3335828 |
| Summary: | Convolutional Neural Networks (CNNs) have achieved remarkable progress in arbitrary artistic style transfer. However, existing state-of-the-art (SOTA) style transfer models are immense, incurring high computational cost and memory demand; this makes real-time, high-resolution inference difficult on GPUs with limited memory and hinders deployment on mobile devices. This paper proposes KBStyle, a novel arbitrary artistic style transfer algorithm whose model size is only 200 KB. First, we design a style transfer network in which the style encoder, content encoder, and corresponding decoder are custom designed to guarantee low computational cost and high shape retention. In addition, a weighted style loss function is presented to improve stylization quality. We then propose Symmetric Knowledge Distillation (SKD), a novel knowledge distillation method for encoder-decoder style transfer models, which redefines the distilled knowledge and compresses the encoder and decoder symmetrically. With SKD, the proposed network is further compressed 14-fold to obtain KBStyle. Experimental results demonstrate that SKD achieves results comparable to other SOTA knowledge distillation algorithms for style transfer, and that KBStyle produces high-quality stylized images. Inference on an Nvidia TITAN RTX GPU takes only 20 ms when the content and style images are both 2K resolution (2048×1080). Moreover, KBStyle's 200 KB model size is far smaller than SOTA models and facilitates style transfer on mobile devices. |
|---|---|
| Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 |
| ISSN: | 1057-7149, 1941-0042 |
| DOI: | 10.1109/TIP.2023.3335828 |