Workload Change Point Detection for Runtime Thermal Management of Embedded Systems

Bibliographic Details
Published in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 35, no. 8, pp. 1358-1371
Main Authors Das, Anup; Merrett, Geoff V.; Tribastone, Mirco; Al-Hashimi, Bashir M.
Format Journal Article
Language English
Published New York IEEE 01.08.2016
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 0278-0070, 1937-4151
DOI 10.1109/TCAD.2015.2504875

Summary: Applications executed on multicore embedded systems interact with system software [such as the operating system (OS)] and hardware, leading to widely varying thermal profiles which accelerate some aging mechanisms, reducing the lifetime reliability. Effectively managing the temperature therefore requires: 1) autonomous detection of changes in application workload and 2) appropriate selection of control levers to manage thermal profiles of these workloads. In this paper, we propose a technique for workload change detection using density ratio-based statistical divergence between overlapping sliding windows of CPU performance statistics. This is integrated in a runtime approach for thermal management, which uses reinforcement learning to select workload-specific thermal control levers by sampling on-board thermal sensors. Identified control levers override the OS's native thread allocation decision and scale hardware voltage-frequency to improve average temperature, peak temperature, and thermal cycling. The proposed approach is validated through its implementation as a hierarchical runtime manager for Linux, with heuristic-based thread affinity selected from the upper hierarchy to reduce thermal cycling and learning-based voltage-frequency selected from the lower hierarchy to reduce average and peak temperatures. Experiments conducted with mobile, embedded, and high-performance applications on ARM-based embedded systems demonstrate that the proposed approach increases workload change detection accuracy by an average of 3.4×, reducing the average temperature by 4 °C-25 °C, peak temperature by 6 °C-24 °C, and thermal cycling by 7%-35% over state-of-the-art approaches.
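
Illustrative sketch (not from the paper): the summary describes detecting workload changes from a density ratio-based divergence between overlapping sliding windows of CPU performance statistics. The Python below shows the general idea only. It swaps in a smoothed histogram estimate of the density ratio and a symmetric Kullback-Leibler divergence, which is a simplification and not necessarily the estimator the authors use. The function names, window length, step, bin count, and threshold are all assumptions chosen for illustration.

```python
import math

def histogram(samples, lo, hi, bins, eps=1e-3):
    """Smoothed, normalized histogram of samples over [lo, hi]."""
    counts = [eps] * bins               # eps keeps every bin non-zero
    width = (hi - lo) / bins or 1.0     # guard against a constant signal
    for x in samples:
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

def divergence(ref, test, bins=8):
    """Symmetric KL divergence from the per-bin density ratio p/q."""
    lo = min(min(ref), min(test))
    hi = max(max(ref), max(test))
    p = histogram(ref, lo, hi, bins)
    q = histogram(test, lo, hi, bins)
    # KL(p||q) + KL(q||p) = sum (p - q) * log(p / q)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

def detect_change_points(series, window=20, step=5, threshold=1.0):
    """Slide a reference/test window pair over the series.

    With step < window, successive window pairs overlap. Returns the
    indices (start of the test window) where the divergence between
    the two windows exceeds the threshold.
    """
    changes = []
    for start in range(0, len(series) - 2 * window + 1, step):
        ref = series[start:start + window]
        test = series[start + window:start + 2 * window]
        if divergence(ref, test) > threshold:
            changes.append(start + window)
    return changes

# Hypothetical CPU-utilization trace: a light phase followed by a heavy one.
light = [0.20 + 0.01 * (i % 5) for i in range(60)]
heavy = [0.80 + 0.01 * (i % 5) for i in range(60)]
changes = detect_change_points(light + heavy)
print(changes)
```

Because the windows overlap, several consecutive window positions around the true change (sample 60 in this synthetic trace) are flagged. A runtime manager would typically merge such a cluster into a single change event before re-selecting control levers.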