Optimal Compression for Two-Field Entries in Fixed-Width Memories

Bibliographic Details
Published in	IEEE Transactions on Information Theory, Vol. 64, no. 6, pp. 4309-4322
Main Authors	Rottenstreich, Ori; Cassuto, Yuval
Format	Journal Article
Language	English
Published	New York: IEEE, 01.06.2018; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Algorithms; Codes; Compression algorithms; Data compression; Decoding; Dictionaries; fixed-width memories; Heuristic algorithms; Huffman coding; Machine learning; network switches and routers; Optimization; Random access memory; Routers; Switches; table compression; Tables
ISSN	0018-9448, 1557-9654
DOI	10.1109/TIT.2018.2820688

More Information
Summary:	Data compression is a well-studied (and well-solved) problem in the setup of long coding blocks. But important emerging applications need to compress data to memory words of small fixed widths. This new setup is the subject of this paper. In the problem we consider, we have two sources with known discrete distributions, and we wish to find codes that maximize the success probability that the two source outputs are represented in $L$ bits or less. A good practical use for this problem is a table with two-field entries that is stored in a memory of a fixed width $L$. Such tables of very large sizes are common in network switches/routers and in data-intensive machine-learning applications. After defining the problem formally, we solve it optimally with an efficient code-design algorithm. We also solve the problem in the more constrained case where a single code is used in both fields (to save space for storing code dictionaries). For both code-design problems we find decompositions that yield efficient dynamic-programming algorithms. With the help of an empirical study we show the success probabilities of the optimal codes for different distributions and memory widths. In particular, this paper demonstrates the superiority of the new codes over existing compression algorithms.
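
The objective named in the summary, the probability that a two-field entry fits in a fixed width, can be illustrated with a minimal Python sketch. It only evaluates a given pair of codes; it is not the paper's code-design algorithm. The sketch assumes that each field is encoded with its own binary prefix code, that the two codewords are concatenated, and that the two fields are statistically independent; none of these details is stated in this record. The function names, distributions, and codeword lengths are hypothetical.

```python
from fractions import Fraction
from itertools import product


def satisfies_kraft(lengths):
    """Return True if a binary prefix code with these codeword lengths exists.

    This is the Kraft inequality: the sum of 2**(-l) over all lengths is at most 1.
    """
    return sum(Fraction(1, 2 ** l) for l in lengths) <= 1


def success_probability(p1, len1, p2, len2, width):
    """Probability that the two codewords together fit in `width` bits.

    Assumption of this sketch: the two fields are independent, so the
    probability of the pair (i, j) is p1[i] * p2[j].
    """
    return sum(
        p1[i] * p2[j]
        for i, j in product(range(len(p1)), range(len(p2)))
        if len1[i] + len2[j] <= width
    )


# Hypothetical example: two three-symbol sources and a memory width of 3 bits.
p1, len1 = [0.5, 0.25, 0.25], [1, 2, 2]
p2, len2 = [0.7, 0.2, 0.1], [1, 2, 2]
assert satisfies_kraft(len1) and satisfies_kraft(len2)
print(success_probability(p1, len1, p2, len2, width=3))
```

In this example an entry fails only when both fields take a 2-bit codeword, so the result is about 0.85. A code-design procedure would search over the length vectors to make this value as large as possible.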