Fine-grained video super-resolution via spatial-temporal learning and image detail enhancement
| Published in | Engineering Applications of Artificial Intelligence, Vol. 131, p. 107789 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | Elsevier Ltd, 01.05.2024 |
| ISSN | 0952-1976, 1873-6769 |
| DOI | 10.1016/j.engappai.2023.107789 |
Summary: | This paper addresses the problem of fine-grained video super-resolution (FGVSR): suppressing the temporal flickering caused by separately processed consecutive frames and enhancing the quality of restored frame details when upscaling videos. Some existing video SR methods fail to sufficiently exploit the spatial-temporal information in the input low-resolution (LR) videos, while others generate undesirable artifacts or fail to reconstruct image details well. To overcome these problems, we present a novel deep learning framework for FGVSR, which takes a set of consecutive LR video frames and generates the corresponding super-resolved frames. Our deep FGVSR framework reconstructs missing information from the LR sources using the proposed multi-frame alignment and refinement strategies. More specifically, we propose an alignment module, in which multiple frames are aligned at the feature level, to prevent the output videos from flickering. We then introduce a feature fusion module, in which the aligned features produced by the alignment module are fused and refined in a multi-scale manner. Finally, the proposed refinement module reconstructs missing information from the fused features. In addition, we embed an image enhancement module on the skip connection from the input layer to the output layer of our network to further enhance the SR results. Experimental results show that the proposed deep FGVSR achieves state-of-the-art performance, compared with existing deep learning-based VSR methods, on three well-known benchmarks: REDS, Vid4, and Vimeo90k. In particular, compared with the state-of-the-art VSR methods in our experiments, our FGVSR achieves quantitative improvements of 0.70 dB to 9.54 dB in PSNR. Our method has also been shown to be effective for other image restoration tasks, such as image inpainting. |
---|---|
Highlights:
- An end-to-end trainable, deep learning-based fine-grained video super-resolution model is proposed.
- Temporal dependency is obtained implicitly, without performing explicit motion compensation.
- Image details are reconstructed and further sharpened by the refinement and enhancement modules.
- The approach can be adaptively integrated into other deep image restoration models to further boost performance.
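The pipeline described in the summary (feature-level alignment → multi-scale fusion → refinement, plus an image enhancement module on the skip connection) can be sketched schematically. The NumPy snippet below is an illustrative stand-in, not the paper's actual network: the alignment, fusion, and refinement steps are placeholders (frame stacking, temporal averaging, nearest-neighbour upsampling), and `unsharp_mask` is a hypothetical substitute for the learned enhancement module. It only shows how the residual branch and the enhanced skip connection combine.

```python
import numpy as np

def upsample_nearest(frame, scale):
    # Nearest-neighbour upsampling via a Kronecker product
    # (placeholder for learned upsampling layers).
    return np.kron(frame, np.ones((scale, scale)))

def unsharp_mask(img, amount=0.5):
    # Toy detail enhancement: boost high frequencies (img - blur).
    # Stands in for the paper's learned image enhancement module.
    blur = (np.roll(img, 1, 0) + np.roll(img, -1, 0) +
            np.roll(img, 1, 1) + np.roll(img, -1, 1)) / 4.0
    return img + amount * (img - blur)

def fgvsr_sketch(frames, scale=4):
    """Schematic FGVSR data flow over T consecutive LR frames.

    align -> fuse -> refine forms the residual branch; the skip
    connection from the centre input frame passes through an
    enhancement step before being added back.
    """
    center = frames[len(frames) // 2]
    # "Alignment": placeholder — the real model aligns frames
    # at the feature level to suppress temporal flickering.
    aligned = np.stack(frames)
    # "Fusion": merge temporal information (here a simple mean;
    # the paper fuses aligned features in a multi-scale manner).
    fused = aligned.mean(axis=0)
    # "Refinement": reconstruct missing high-resolution detail
    # (here just upsampling the temporal residual).
    residual = upsample_nearest(fused - center, scale)
    # Skip connection with the image enhancement step.
    skip = unsharp_mask(upsample_nearest(center, scale))
    return skip + residual

# Five consecutive 8x8 LR frames, upscaled 4x to 32x32.
frames = [np.random.rand(8, 8) for _ in range(5)]
sr = fgvsr_sketch(frames, scale=4)
assert sr.shape == (32, 32)
```

With identical input frames the temporal residual vanishes and the output reduces to the enhanced skip connection alone, which makes the two-branch structure easy to verify.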