Analysis of Job Metadata for Enhanced Wall Time Prediction

Bibliographic Details
Published in: Job Scheduling Strategies for Parallel Processing, Vol. 11332, pp. 1-14
Main Authors: Soysal, Mehmet; Berghoff, Marco; Streit, Achim
Format: Book Chapter
Language: English
Published: Switzerland: Springer International Publishing AG, 2019
Springer International Publishing
Series: Lecture Notes in Computer Science
ISBN: 9783030106317, 3030106314
ISSN: 0302-9743, 1611-3349
DOI: 10.1007/978-3-030-10632-4_1

Summary: For efficient utilization of large-scale HPC systems, resource management and job scheduling are of the highest priority. Modern job scheduling systems therefore require an estimate of each job's total wall time at submission time. Proper wall time estimates are key to reliable scheduling decisions. Typically, users specify these estimates at submission time, based on either previous knowledge or limits imposed by the system. Real-world experience shows that user-given estimates are far from accurate. Hence, an automated system that creates more precise wall time estimates for submitted jobs is desirable. In this paper, we investigate different job metadata and their impact on wall time prediction. For the prediction, we used machine learning methods and the workload traces of large HPC systems. In contrast to previous work, we also consider the job name and, in particular, the submission directory. Our evaluation shows that, per user, we can predict job wall times more accurately than most users do, by a factor of seven, without any in-depth analysis of the job.