Parallel design for error-resilient entropy coding algorithm on GPU


Bibliographic Details
Published in: Journal of parallel and distributed computing, Vol. 73, no. 4, pp. 411-419
Main Authors: Dai, Yuan; Fang, Yong; He, Dongjian; Huang, Bormin
Format: Journal Article
Language: English
Published: Amsterdam, Elsevier Inc, 01.04.2013
Elsevier
ISSN: 0743-7315, 1096-0848
DOI: 10.1016/j.jpdc.2012.12.008

Summary: The error-resilient entropy coding (EREC) algorithm is an effective method for combating error propagation at low cost in many compression methods that use variable-length coding (VLC). However, the main drawback of the EREC is its high complexity. To overcome this disadvantage, a parallel EREC is implemented on a graphics processing unit (GPU) using NVIDIA CUDA technology. The original EREC is fine-grained parallel at each stage, which brings additional communication overhead. To achieve high efficiency in the parallel EREC, we propose the partitioning EREC (P-EREC) algorithm, which splits the variable-length blocks into groups and codes every group separately using the EREC. Each GPU thread processes one group, which makes the EREC coarse-grained parallel. In addition, some optimization strategies are discussed in order to obtain higher performance on the GPU. When the variable-length data blocks are divided into 128 groups, experimental results show that the parallel P-EREC achieves a 32× to 123× speedup over the original C code of the EREC compiled with the O2 optimization option; with 256 groups, the speedup is 54× to 350×. Even higher speedup can be obtained with more groups. Compared to the EREC, the P-EREC not only achieves a good speedup, but also slightly improves the resilience of the VLC bit-stream against burst or random errors.
► We attempt to optimize the performance of the EREC by parallel processing.
► A parallel EREC is implemented on a graphics processing unit (GPU).
► We propose the partitioning EREC (P-EREC) algorithm.
► We implemented the parallel P-EREC on the GPU and made full use of optimization techniques.
► The parallel P-EREC gains a 32× to 123× speedup compared with the original C code.
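
The partitioning idea in the summary can be illustrated with a minimal Python sketch. It is not the authors' CUDA implementation, and this record does not give the paper's details. The sketch assumes that bits are represented as strings of "0" and "1", that all slots in a group share one length (the ceiling of the average block length), and that the offset sequence is simply 0, 1, 2, ... The function names `erec_pack` and `p_erec_pack` and the `group_size` parameter are hypothetical, and decoding is omitted.

```python
from typing import List


def erec_pack(blocks: List[str]) -> List[str]:
    """Pack variable-length bit strings into equal-length slots, EREC-style.

    Assumptions (not taken from the paper): equal slot lengths and the
    offset sequence 0, 1, 2, ..., n-1.
    """
    n = len(blocks)
    if n == 0:
        return []
    total_bits = sum(len(b) for b in blocks)
    slot_len = -(-total_bits // n)      # ceiling of the average block length
    slots = [""] * n
    remaining = list(blocks)            # bits of each block not yet placed
    for stage in range(n):
        # Within one stage every block inspects a different slot, so this
        # inner loop is the fine-grained parallel step of the plain EREC.
        for i in range(n):
            if not remaining[i]:
                continue
            j = (i + stage) % n         # slot searched by block i at this stage
            free = slot_len - len(slots[j])
            if free > 0:
                slots[j] += remaining[i][:free]
                remaining[i] = remaining[i][free:]
    return slots


def p_erec_pack(blocks: List[str], group_size: int) -> List[List[str]]:
    """Split the blocks into groups and pack each group independently.

    Each call to erec_pack touches only its own group, so on a GPU one
    thread could process one group (coarse-grained parallelism).
    """
    size = max(1, group_size)
    groups = [blocks[k:k + size] for k in range(0, len(blocks), size)]
    return [erec_pack(g) for g in groups]


if __name__ == "__main__":
    example = ["1101101", "10", "111", "0"]
    print(erec_pack(example))           # all four blocks packed together
    print(p_erec_pack(example, 2))      # two independent groups of two blocks
```

Because the groups share no state, the list comprehension in `p_erec_pack` is the part that maps naturally to one GPU thread per group, whereas the inner loop of `erec_pack` would need synchronization between stages if it were parallelized directly.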