Optimal Compression for Two-Field Entries in Fixed-Width Memories
| Published in | IEEE Transactions on Information Theory, vol. 64, no. 6, pp. 4309-4322 |
|---|---|
| Main Authors | , |
| Format | Journal Article |
| Language | English |
| Published | New York: IEEE, 01.06.2018; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects | |
| ISSN | 0018-9448 1557-9654 |
| DOI | 10.1109/TIT.2018.2820688 |
| Summary | Data compression is a well-studied (and well-solved) problem in the setup of long coding blocks. But important emerging applications need to compress data to memory words of small fixed widths. This new setup is the subject of this paper. In the problem we consider, we have two sources with known discrete distributions, and we wish to find codes that maximize the success probability that the two source outputs are represented in $L$ bits or less. A good practical use for this problem is a table with two-field entries that is stored in a memory of a fixed width $L$. Such tables of very large sizes are common in network switches/routers and in data-intensive machine-learning applications. After defining the problem formally, we solve it optimally with an efficient code-design algorithm. We also solve the problem in the more constrained case where a single code is used in both fields (to save space for storing code dictionaries). For both code-design problems we find decompositions that yield efficient dynamic-programming algorithms. With the help of an empirical study we show the success probabilities of the optimal codes for different distributions and memory widths. In particular, this paper demonstrates the superiority of the new codes over existing compression algorithms. |
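
The objective stated in the summary can be illustrated with a small sketch. This is not the paper's dynamic-programming algorithm; it is an exhaustive search that only works for tiny alphabets, written under several assumptions that the record does not confirm: the two sources are independent, both fields use binary prefix codes (so codeword lengths must satisfy the Kraft inequality), every codeword has length at least 1, and an entry "succeeds" when the two codeword lengths sum to at most `L`. The function names `kraft_ok`, `success_probability`, and `best_lengths_brute_force` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def kraft_ok(lengths):
    """True if a binary prefix code with these codeword lengths exists."""
    return sum(Fraction(1, 2 ** l) for l in lengths) <= 1


def success_probability(p1, p2, len1, len2, L):
    """Probability that the two codewords together fit in L bits.

    Assumes the two sources are independent (an assumption of this sketch).
    """
    return sum(
        a * b
        for a, la in zip(p1, len1)
        for b, lb in zip(p2, len2)
        if la + lb <= L
    )


def best_lengths_brute_force(p1, p2, L, max_len=None):
    """Exhaustively search codeword lengths that maximize success probability.

    Exponential in the alphabet sizes; for illustration only.
    Returns (probability, lengths for field 1, lengths for field 2).
    """
    max_len = L if max_len is None else max_len
    candidates = range(1, max_len + 1)
    valid1 = [ls for ls in product(candidates, repeat=len(p1)) if kraft_ok(ls)]
    valid2 = [ls for ls in product(candidates, repeat=len(p2)) if kraft_ok(ls)]

    best = (Fraction(0), None, None)
    for len1 in valid1:
        for len2 in valid2:
            prob = success_probability(p1, p2, len1, len2, L)
            if best[1] is None or prob > best[0]:
                best = (prob, len1, len2)
    return best


# Toy distributions (made up for this example, not taken from the paper).
p1 = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
p2 = [Fraction(1, 2), Fraction(1, 2)]

prob, len1, len2 = best_lengths_brute_force(p1, p2, L=2)
print(prob)  # prints 1/2
```

With a width of 2 bits, only the pair of 1-bit codewords fits, and the Kraft inequality allows at most one 1-bit codeword in the three-symbol field, so the best achievable success probability in this toy case is 1/2. Widening the memory to `L=3` lets every pair fit under the lengths `(1, 2, 2)` and `(1, 1)`.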