Research and Implementation of a Memory Access Latency Measurement Model Based on Variable Strides
Published in | 计算机工程与科学 (Computer Engineering & Science), Vol. 36, No. 1, pp. 12-18 |
---|---|
Main Author | MAO Xi-long |
Format | Journal Article |
Language | Chinese |
Published | College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China; 2014 |
Subjects | Memory latency; variable stride; measurement; SMT; multi-core processor; FT processor |
ISSN | 1007-130X |
DOI | 10.3969/j.issn.1007-130X.2014.01.003 |
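Summary: | Measuring memory access latency provides important guidance for optimizing an application's memory access patterns and data placement, but techniques such as data caches, multithreading, and data prefetching severely interfere with measurement accuracy. This work designs and implements a memory access latency measurement model based on variable strides: within a block of memory, it builds a ring of accesses according to a user-specified stride and traverses the ring repeatedly; the average time per access is taken as the memory access latency. Finally, the memory access latency of an Intel general-purpose processor and a FeiTeng (FT) processor is measured and compared across different data sizes, strides, and thread counts. The model reveals the memory hierarchy and reports the measured latency precisely. |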
---|---|
Bibliography: | CN 43-1258/TP. Published English abstract: Evaluating memory access latency is important for optimizing application access patterns and data placement. However, caches, multi-threading, data prefetching, and other techniques seriously interfere with the accuracy of memory access latency measurement. A measurement model based on variable strides is designed and implemented. According to user-specified strides, we create a sequence ring in a memory region and access this ring circularly to obtain the average time as the memory access latency. Finally, we measure the memory latency of an Intel general-purpose processor and an FT processor for different data sizes, strides, and thread counts, and compare the results. The model can display the memory hierarchy and report memory latency precisely. Authors: MAO Xi-long, YANG An, LU Gao-feng, LIN Qi, CHENG Hui (College of Computer, National University of Defense Technology, Changsha 410073, China). Keywords: memory latency; variable stride; measurement; SMT; multi-core processor; FT processor |
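
The summary describes linking a memory region into a ring at a user-specified stride and averaging the time of repeated traversals. A minimal single-threaded sketch of that idea in C is given here for orientation only; it is not the authors' implementation. The names `build_ring`, `chase`, and `measure_latency_ns` are hypothetical, the timer is the standard `clock()` (CPU time, fairly coarse), and the ring is linked in plain ascending order, which a hardware prefetcher may be able to predict. The record lists prefetching and multithreading as interference sources but does not say how the model handles them, so the sketch does not attempt to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Keeps the result of each walk observable so the compiler cannot drop the loop. */
static void *volatile chase_sink;

/* Link `size` bytes at `buf` into one ring: the slot at byte offset k*stride
 * stores the address of the slot at (k+1)*stride, and the last slot points
 * back to the first. Returns the number of slots in the ring. */
size_t build_ring(char *buf, size_t size, size_t stride)
{
    /* Every slot must be able to hold one properly aligned pointer. */
    assert(stride >= sizeof(void *) && stride % sizeof(void *) == 0);
    assert(size >= stride);
    size_t slots = size / stride;
    for (size_t k = 0; k < slots; k++) {
        size_t next = (k + 1) % slots;
        *(void **)(buf + k * stride) = buf + next * stride;
    }
    return slots;
}

/* Perform `steps` dependent loads: each load yields the address of the next,
 * so a load cannot start before the previous one has completed. */
void *chase(void *start, size_t steps)
{
    void **p = start;
    while (steps-- > 0)
        p = *p;
    return p;
}

/* Average time per load in nanoseconds for a ring covering `size` bytes at
 * the given stride, or -1.0 if the buffer cannot be allocated. */
double measure_latency_ns(size_t size, size_t stride, size_t steps)
{
    assert(steps > 0);
    char *buf = malloc(size);
    if (buf == NULL)
        return -1.0;
    size_t slots = build_ring(buf, size, stride);
    chase_sink = chase(buf, slots);   /* warm-up: one full lap around the ring */
    clock_t begin = clock();
    chase_sink = chase(buf, steps);   /* timed walk */
    clock_t end = clock();
    free(buf);
    return (double)(end - begin) * 1e9 / CLOCKS_PER_SEC / (double)steps;
}
```

Calling `measure_latency_ns` with a fixed stride (for example 64 bytes, an assumed cache-line size) over a range of buffer sizes would be expected to show the average rise as the ring outgrows each cache level, which corresponds to the memory hierarchy the summary says the model reveals. Because `clock()` is coarse, `steps` should be large (millions) for the average to be meaningful.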