Multi-Level Hardware Prefetching Based on Stream Access Characteristics (基于流访问特征的多级硬件预取)

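Hardware data prefetching loads data that the processor is likely to access into the Cache ahead of time, so that memory accesses hit the Cache as often as possible and system performance improves. However, existing research and applications mainly prefetch into the first-level Cache only, so prefetched data may not be loaded into the Cache in time before it is used, which reduces the performance gain from hardware prefetching. To address this problem, this paper takes prefetching based on stream access characteristics as its foundation, proposes a method that prefetches into multiple Cache levels simultaneously, and implements the stream-based prefetching. Experimental results on the SPEC CPU2000 benchmark suite show that, compared with prefetching into the first-level Cache only, prefetching into multiple Cache levels simultaneously improves the performance of integer programs by 2.11% on average and by up to 11.19%, and that of floating-point programs by 3.08% on average and by up to 12.77%.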

Bibliographic Details
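Published in: 计算机工程 (Computer Engineering), Vol. 42, no. 1, pp. 51-55
Main Authors: JIA Xun (贾迅), WENG Zhiqiang (翁志强), HU Xiangdong (胡向东)
Format: Journal Article
Language: Chinese
Published: 2016; Shanghai High Performance IC Design Center, Shanghai 201204, China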
ISSN: 1000-3428
DOI: 10.3969/j.issn.1000-3428.2016.01.010

English abstract: Hardware data prefetching loads data into the Cache before the processor actually references it, thereby improving system performance. Existing research and applications focus on prefetching data into the first-level Cache only. As a result, prefetched data may not arrive in the Cache before it is needed, which reduces the benefit of hardware data prefetching. This paper proposes a technique for prefetching into multiple Cache levels, built on a stream-based prefetcher. Performance analysis on SPEC CPU2000 shows that, compared with prefetching into the first-level Cache only, the proposed technique improves the performance of integer applications by 2.11% on average and by up to 11.19%, and of floating-point applications by 3.08% on average and by up to 12.77%.
CN: 31-1289/TP
Keywords: memory wall; stream access; processor; multiple level Cache; hardware prefetching
Authors: JIA Xun, WENG Zhiqiang, HU Xiangdong (Shanghai High Performance IC Design Center, Shanghai 201204, China)