KBStyle: Fast Style Transfer Using a 200 KB Network With Symmetric Knowledge Distillation
| Published in | IEEE transactions on image processing Vol. 33; pp. 82 - 94 |
|---|---|
| Main Authors | , , , , |
| Format | Journal Article |
| Language | English |
| Published | United States: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2024 |
| Subjects | |
| ISSN | 1057-7149, 1941-0042 |
| DOI | 10.1109/TIP.2023.3335828 |
| Summary: | Convolutional Neural Networks (CNNs) have achieved remarkable progress in arbitrary artistic style transfer. However, existing state-of-the-art (SOTA) style transfer models are immense, incurring high computational cost and memory demand; this makes real-time, high-resolution inference difficult on GPUs with limited memory and hinders deployment on mobile devices. This paper proposes KBStyle, a novel arbitrary artistic style transfer algorithm whose model size is only 200 KB. First, we design a style transfer network in which the style encoder, content encoder, and corresponding decoder are custom designed to guarantee low computational cost and high shape retention. In addition, a weighted style loss function is presented to improve stylization quality. We then propose Symmetric Knowledge Distillation (SKD), a novel knowledge distillation method for encoder-decoder style transfer models, which redefines the distilled knowledge and compresses the encoder and decoder symmetrically. With SKD, the proposed network is further compressed 14-fold to obtain KBStyle. Experimental results demonstrate that SKD achieves results comparable to other SOTA knowledge distillation algorithms for style transfer, and that KBStyle produces high-quality stylized images. Inference on an Nvidia TITAN RTX GPU takes only 20 ms when the content and style images are both 2K resolution (2048×1080). Moreover, KBStyle's 200 KB model size is far smaller than SOTA models and facilitates style transfer on mobile devices. |
|---|---|
| Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 |
| ISSN: | 1057-7149, 1941-0042 |
| DOI: | 10.1109/TIP.2023.3335828 |