Improving Cache Power and Performance Using Deterministic Naps and Early Miss Detection

Bibliographic Details
Published in: IEEE Transactions on Multi-Scale Computing Systems, Vol. 1, no. 3, pp. 150-158
Main Authors: Olorode, Oluleye D.; Nourani, Mehrdad
Format: Journal Article
Language: English
Published: IEEE, 01.07.2015
ISSN: 2332-7766
DOI: 10.1109/TMSCS.2015.2494043

Summary: Cache memory systems account for a significant portion of the static and dynamic power consumed by processors. Similarly, access latency through the cache memory system significantly affects overall processor performance. Several techniques have been proposed to address power or performance individually. However, almost all trade off performance for power or vice versa. We propose a novel scheme that improves performance while reducing both static and dynamic power with minimal area overhead. Our proposed scheme reduces dynamic power by using a hash-based mechanism to minimize the number of cache lines read during program execution. This is achieved by identifying, and not reading, those lines that are guaranteed non-matches (i.e., cache misses) for a new access. Performance improves when all cache lines of a referenced set are determined to be non-matches for the requested address, so the access skips a few cache pipeline stages as a guaranteed miss. Static power savings are achieved by exploiting in-flight cache access information to deterministically lower the power state of cache lines that are guaranteed not to be accessed in the immediate future. These techniques integrate easily into existing cache architectures and were evaluated using widely known CAD tools and benchmarks. We have observed up to 92, 17, and 2 percent improvements in performance, static power, and dynamic power, respectively, with less than 3 percent area overhead.
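
The hash-based filtering described in the summary can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' design: the XOR-fold hash, the 4-bit hash width, and the names `fold_hash`, `CacheSet`, `fill`, and `lookup` are all hypothetical. Each line keeps a short hash of its tag. A way whose stored hash differs from the hash of the requested tag is a guaranteed non-match and is not read. If no way matches, the access is a guaranteed miss and can be flagged early.

```python
from typing import List, Optional, Tuple

HASH_BITS = 4  # assumed hash width; the summary does not state one


def fold_hash(tag: int, bits: int = HASH_BITS) -> int:
    """XOR-fold a tag into `bits` bits (an illustrative hash, chosen for simplicity)."""
    mask = (1 << bits) - 1
    h = 0
    while tag:
        h ^= tag & mask
        tag >>= bits
    return h


class CacheSet:
    """One set of a set-associative cache with a per-line tag hash."""

    def __init__(self, ways: int) -> None:
        self.tags: List[Optional[int]] = [None] * ways
        self.hashes: List[Optional[int]] = [None] * ways

    def fill(self, way: int, tag: int) -> None:
        """Install a line and record the hash of its tag."""
        self.tags[way] = tag
        self.hashes[way] = fold_hash(tag)

    def lookup(self, tag: int) -> Tuple[bool, int, bool]:
        """Return (hit, lines_read, early_miss) for a requested tag."""
        h = fold_hash(tag)
        # Only ways with an equal hash can possibly hold the tag.
        candidates = [w for w, stored in enumerate(self.hashes) if stored == h]
        if not candidates:
            # Every way is a guaranteed non-match: no line is read.
            return (False, 0, True)
        hit = any(self.tags[w] == tag for w in candidates)
        return (hit, len(candidates), False)


s = CacheSet(ways=4)
s.fill(0, 0x12)
s.fill(1, 0x34)
hit_result = s.lookup(0x12)    # same tag: one line read, hit
alias_result = s.lookup(0x21)  # same hash, different tag: one line read, miss
early_result = s.lookup(0x40)  # no hash match: nothing read, early miss
```

A hash mismatch proves a tag mismatch, so filtering never hides a hit. A hash match proves nothing, which is why `0x21` still costs one line read before it is resolved as a miss.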