Deep learning assisted quality ranking for list decoding of videos subject to transmission errors

Bibliographic Details
Published in IEEE International Conference on Wireless and Mobile Computing, Networking, and Communications (Print), pp. 135 - 142
Main Authors Guichemerre, Alexis, Coulombe, Stéphane, Trioux, Anthony, Coudoux, François-Xavier, Corlay, Patrick
Format Conference Proceeding
Language English
Published IEEE 21.06.2023
ISSN 2160-4894
DOI 10.1109/WiMob58348.2023.10187827

Summary: In this paper, we propose a new deep learning-based quality ranking framework to assist video list decoding methods in the context of unreliable video transmissions. The objective is to identify an intact image (corrected video frame) among a list of candidate images generated by a list decoding method, where all candidates, except for the intact image, are corrupted. The framework comprises a deep learning-based no-reference image quality assessment (NR-IQA) system for non-uniform video distortions (NUD) that ranks the candidate images according to their quality, which allows identifying the best one. To show the validity of our proposed framework, we develop an NR-IQA system relying on a proven patch-based convolutional neural network (CNN) architecture, which we adapt to better account for the non-uniform distortions observed in the candidate images, e.g., those caused by H.265 transmission errors during wireless communications. Specifically, we modify the patch size on which our CNN for non-uniform distortions (CNN-NUD) operates to capture a larger and more meaningful spatial context. Moreover, we develop a new training database using images resulting from various bit modifications in the received video packets, to simulate the list decoding process, and train the system using a full-reference IQA (FR-IQA) method. Experiments on intra frames of videos encoded using H.265 show the ability of this system to identify an intact image among a set of five candidate images with an average accuracy of 96.6%, whereas traditional NR-IQA metrics and the initially trained CNN system offer poor accuracy, ranging between 15.7% and 33.6%.
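The selection step described in the abstract, scoring each candidate frame with a patch-based NR-IQA model and keeping the top-ranked one, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the trained CNN-NUD scorer is not available here, so `toy_quality` is a hypothetical stand-in (negative local variance) used only to make the sketch executable.

```python
import numpy as np

def rank_candidates(candidates, quality_fn, patch_size=64):
    """Return the index of the best-quality frame in `candidates`.

    Each frame is scored by averaging `quality_fn` over its
    non-overlapping patches (higher score = better quality),
    mimicking a patch-based NR-IQA ranking; the top-ranked frame
    is taken to be the intact one.
    """
    def frame_score(frame):
        h, w = frame.shape
        scores = [
            quality_fn(frame[y:y + patch_size, x:x + patch_size])
            for y in range(0, h - patch_size + 1, patch_size)
            for x in range(0, w - patch_size + 1, patch_size)
        ]
        return float(np.mean(scores))

    return int(np.argmax([frame_score(f) for f in candidates]))

def toy_quality(patch):
    # Hypothetical stand-in for a trained patch scorer: corruption
    # from bit errors tends to raise local variance, so penalize it.
    return -float(np.var(patch))
```

In the paper's setting, `quality_fn` would be the trained CNN-NUD model operating on the enlarged patches; identifying the intact frame then amounts to taking the argmax of the averaged patch scores over the five candidates.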