Optimal Compression for Two-Field Entries in Fixed-Width Memories

Bibliographic Details
Published in: IEEE Transactions on Information Theory, Vol. 64, no. 6, pp. 4309-4322
Main Authors: Rottenstreich, Ori; Cassuto, Yuval
Format: Journal Article
Language: English
Published: New York, IEEE, 01.06.2018
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 0018-9448, 1557-9654
DOI: 10.1109/TIT.2018.2820688

Summary: Data compression is a well-studied (and well-solved) problem in the setup of long coding blocks. But important emerging applications need to compress data to memory words of small fixed widths. This new setup is the subject of this paper. In the problem we consider, we have two sources with known discrete distributions, and we wish to find codes that maximize the success probability that the two source outputs are represented in $L$ bits or less. A good practical use for this problem is a table with two-field entries that is stored in a memory of a fixed width $L$. Such tables of very large sizes are common in network switches/routers and in data-intensive machine-learning applications. After defining the problem formally, we solve it optimally with an efficient code-design algorithm. We also solve the problem in the more constrained case where a single code is used in both fields (to save space for storing code dictionaries). For both code-design problems we find decompositions that yield efficient dynamic-programming algorithms. With the help of an empirical study we show the success probabilities of the optimal codes for different distributions and memory widths. In particular, this paper demonstrates the superiority of the new codes over existing compression algorithms.
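
Illustration (not from the paper): the objective described in the summary, the probability that a two-field entry fits in $L$ bits, can be evaluated for a given pair of codes with a few lines of Python. This is only a sketch of the objective, not the authors' code-design algorithm. It assumes that the two fields are independent and that each code is a binary prefix code described by its codeword lengths. All names and the example distributions (P1, LEN1, P2, LEN2, kraft_sum, success_probability) are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example data: symbol probabilities and codeword lengths
# (in bits) for the two fields. These values are illustrative only.
P1 = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
LEN1 = {"a": 1, "b": 2, "c": 2}
P2 = {"x": Fraction(1, 2), "y": Fraction(1, 4), "z": Fraction(1, 4)}
LEN2 = {"x": 1, "y": 2, "z": 2}


def kraft_sum(lengths):
    """Return the Kraft sum of a set of binary codeword lengths.

    A binary prefix code with these lengths exists iff the sum is <= 1.
    """
    return sum(Fraction(1, 2 ** length) for length in lengths.values())


def success_probability(p1, len1, p2, len2, width):
    """Probability that the two encoded fields fit together in `width` bits.

    Assumes the two fields are independent, so the probability of a pair
    is the product of the marginal probabilities.
    """
    total = Fraction(0)
    for s1, s2 in product(p1, p2):
        if len1[s1] + len2[s2] <= width:
            total += p1[s1] * p2[s2]
    return total


if __name__ == "__main__":
    # Both length assignments must be realizable as prefix codes.
    assert kraft_sum(LEN1) <= 1 and kraft_sum(LEN2) <= 1
    for width in (2, 3, 4):
        print(width, success_probability(P1, LEN1, P2, LEN2, width))
```

A code-design procedure would search over the length assignments to maximize this quantity for a fixed width; the sketch above only scores one candidate pair of codes.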