Improving Cache Power and Performance Using Deterministic Naps and Early Miss Detection

Bibliographic Details
Published in	IEEE Transactions on Multi-Scale Computing Systems, Vol. 1, no. 3, pp. 150-158
Main Authors	Olorode, Oluleye D.; Nourani, Mehrdad
Format	Journal Article
Language	English
Published	IEEE 01.07.2015
Subjects	Cache; Cache memory; Decoding; Deterministic naps; Dynamic & static power; Fast access memory; Program processors; Random access memory; Solid modeling
ISSN	2332-7766
DOI	10.1109/TMSCS.2015.2494043

Summary:	Cache memory systems account for a significant portion of the static and dynamic power consumed in processors. Similarly, the access latency through the cache memory system significantly impacts overall processor performance. Several techniques have been proposed to address power or performance individually. However, almost all trade off performance for power or vice versa. We propose a novel scheme that improves performance while reducing both static and dynamic power with minimal area overhead. Our proposed scheme reduces dynamic power by using a hash-based mechanism to minimize the number of cache lines read during program execution. This is achieved by identifying, and not reading, those lines that are guaranteed non-matches (i.e., cache misses) to a new access. Performance improves when all cache lines of a referenced set are determined to be non-matches to the requested address, so that the access skips a few cache pipe stages as a guaranteed miss. Static power savings are achieved by exploiting in-flight cache access information to deterministically lower the power state of cache lines that are guaranteed not to be accessed in the immediate future. These techniques integrate easily into existing cache architectures and were evaluated using widely known CAD tools and benchmarks. We have observed up to 92, 17, and 2 percent improvements in performance, static, and dynamic power, respectively, with less than 3 percent area overhead.
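
The two mechanisms described in the summary can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the 4-bit XOR-fold hash, the names `tag_hash`, `CacheSet`, and `nap_states`, and the per-set granularity of the nap decision are all hypothetical choices made for illustration. The idea shown is that a short hash stored beside each line can rule out lines that cannot match (so they are never read), that an access with no surviving candidates is a guaranteed miss, and that sets not targeted by any in-flight access can be placed in a lower power state.

```python
HASH_BITS = 4  # assumed hash width; the record does not state the real value


def tag_hash(tag):
    """Fold a tag into HASH_BITS bits by XOR-ing successive chunks (assumed hash)."""
    mask = (1 << HASH_BITS) - 1
    h = 0
    while tag:
        h ^= tag & mask
        tag >>= HASH_BITS
    return h


class CacheSet:
    """One set of a set-associative cache with a small hash stored per way."""

    def __init__(self, ways):
        self.tags = [None] * ways    # full tag per way, None if the way is invalid
        self.hashes = [None] * ways  # short hash per way, cheap to compare

    def fill(self, way, tag):
        self.tags[way] = tag
        self.hashes[way] = tag_hash(tag)

    def lookup(self, tag):
        """Return (hit, lines_read, early_miss) for a requested tag."""
        h = tag_hash(tag)
        # A differing hash proves the line cannot match, so it is never read.
        candidates = [w for w, t in enumerate(self.tags)
                      if t is not None and self.hashes[w] == h]
        if not candidates:
            # Every line is a guaranteed non-match: the miss is known early.
            return False, 0, True
        hit = any(self.tags[w] == tag for w in candidates)
        return hit, len(candidates), False


def nap_states(num_sets, in_flight_sets):
    """Sets not referenced by any in-flight access may nap (assumed granularity)."""
    awake = set(in_flight_sets)
    return ["awake" if s in awake else "nap" for s in range(num_sets)]
```

In this model a matching hash does not prove a hit; it only means the line must still be read and its full tag compared. A mismatching hash is what gives the guarantee, which is why the filter can save reads and flag misses early without ever producing a wrong result.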