Fine-Grain Energy Consumption Modeling of HPC Task-Based Programs

Bibliographic Details
Published in: Proceedings / IEEE International Conference on Cluster Computing, pp. 1 - 12
Main Authors: Risse, Jules; Guermouche, Amina; Trahay, Francois
Format: Conference Proceeding
Language: English
Published: IEEE, 02.09.2025
ISSN: 2168-9253
DOI: 10.1109/CLUSTER59342.2025.11186478

Summary: The power consumption of supercomputers is, and will remain, a major concern. Reducing the power consumption of high performance computing (HPC) applications is therefore mandatory. Monitoring the energy consumption of HPC programs is a good first step: using external or software power meters, one can measure the energy consumption of an entire compute node or of some of its hardware components. Unfortunately, the differences in scope and time scale between power meters and code-level functions prevent the identification of power-hungry code blocks. In this work, we propose leveraging the tracing mechanism of the StarPU runtime system to estimate task-level power consumption. We trace the execution of the application while regularly measuring the coarse-grain energy consumption of central processing units (CPUs) and graphics processing units (GPUs) using vendor software interfaces. After execution, we identify the tasks executed on each processing unit during every coarse-grain energy measurement interval. We then use this information to generate an overdetermined linear system linking tasks and energy measurements. Solving this system allows us to estimate the fine-grain power consumption of each task independently of its actual duration. We achieve mean absolute percentage errors (MAPE) ranging from 0.5% to 5% on various CPUs, and from 10% to 28% on GPUs. We show that a solution generated from one run can be used to predict the energy consumption of other runs with different scheduling policies.
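
The abstract's central step, linking coarse-grain energy readings to per-task power through an overdetermined linear system, can be illustrated with a small least-squares fit. The sketch below (Python with NumPy) is not the authors' code: the task names, the busy-time matrix, and the energy readings are made-up placeholders standing in for data that the paper derives from StarPU traces and vendor software interfaces (RAPL and NVML are common examples of such interfaces for CPUs and GPUs).

    # Hypothetical sketch, not the authors' implementation: estimating per-task
    # power from interval-level energy readings by linear least squares.
    import numpy as np

    # Assumed trace-derived data: busy_time[i, j] is the time (seconds) that
    # tasks of type j ran on the monitored processing unit during energy
    # measurement interval i. Task names and values are illustrative only.
    task_types = ["potrf", "trsm", "syrk", "gemm"]
    busy_time = np.array([
        [0.10, 0.00, 0.05, 0.30],
        [0.00, 0.20, 0.10, 0.25],
        [0.05, 0.15, 0.00, 0.35],
        [0.12, 0.08, 0.07, 0.20],
        [0.00, 0.25, 0.12, 0.18],
    ])  # shape: (n_intervals, n_task_types) -- more intervals than task types

    # Assumed coarse-grain energy readings (joules) for the same intervals,
    # e.g. from a vendor interface such as RAPL (CPU) or NVML (GPU).
    energy = np.array([28.1, 33.4, 36.2, 27.5, 32.3])

    # Overdetermined system: busy_time @ power ~= energy, with one unknown
    # average power (watts) per task type. Solve it in the least-squares sense.
    power, residuals, rank, _ = np.linalg.lstsq(busy_time, energy, rcond=None)

    for name, watts in zip(task_types, power):
        print(f"estimated power of task '{name}': {watts:.1f} W")

    # The fitted per-task powers can be combined with task durations from a
    # different trace to predict the energy of runs under other schedules.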