An SMDP approach for Reinforcement Learning in HPC cluster schedulers

Bibliographic Details
Published in Future generation computer systems, Vol. 139, pp. 239-252
Main Authors de Freitas Cunha, Renato Luiz; Chaimowicz, Luiz
Format Journal Article
Language English
Published Elsevier B.V., 01.02.2023
Subjects Deep Reinforcement Learning; Machine Learning; Scheduling; Semi-Markov Decision Processes; Simulation; Workload traces
ISSN 0167-739X
EISSN 1872-7115
DOI 10.1016/j.future.2022.09.025

Abstract Deep reinforcement learning applied to computing systems has shown potential for improving system performance, as well as faster discovery of better allocation strategies. In this paper, we map HPC batch job scheduling to the SMDP formalism, and present an online, deep reinforcement learning-based solution that uses a modification of the Proximal Policy Optimization algorithm for minimizing job slowdown with action masking, supporting large action spaces. In our experiments, we assess the effects of noise in run time estimates in our model, evaluating how it behaves in small (64 processors) and large (16384 processors) clusters. We also show our model is robust to changes in workload and in cluster sizes, showing transfer works with changes of cluster size of up to 10×, and changes from synthetic workload generators to supercomputing workload traces. In our experiments, the proposed model outperforms learning models from the literature and classic heuristics, making it a viable modeling approach for robust, transferable, learning scheduling models.
• An SMDP formulation of HPC job scheduling that minimizes average bounded slowdown.
• Evaluation of the effects of noise in run time estimates in small and large clusters.
• Evaluation of the SMDP model with traces from real supercomputing workloads.
• A comparison of agent performance with models from the literature.
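
As a rough illustration of two ideas named in the abstract, the sketch below computes bounded slowdown for a single job and applies action masking to a vector of policy logits. It is not the authors' implementation: the function names, the threshold `tau` (10 seconds is a common choice in the scheduling literature) and the plain-list representation of logits are assumptions made for this example.

```python
import math


def bounded_slowdown(wait_time, run_time, tau=10.0):
    """Bounded slowdown of one job (hypothetical helper, not from the paper).

    The threshold tau (an assumed default of 10 seconds) keeps very short
    jobs from dominating the average, and the result is never below 1.
    """
    return max((wait_time + run_time) / max(run_time, tau), 1.0)


def masked_action_probabilities(logits, valid):
    """Softmax over logits, with invalid actions forced to probability 0.

    `valid` is a list of booleans, one per action. Invalid actions have
    their logit replaced by -inf before the softmax, which is the usual
    way action masking is applied in policy-gradient methods.
    """
    if not any(valid):
        raise ValueError("at least one action must be valid")
    masked = [x if ok else -math.inf for x, ok in zip(logits, valid)]
    peak = max(masked)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]
```

For instance, a job that waited 90 seconds and ran for 10 seconds has a bounded slowdown of 10, while masking the middle action of three equal logits leaves the remaining two actions with probability 0.5 each.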
Author de Freitas Cunha, Renato Luiz
Chaimowicz, Luiz
Author_xml – sequence: 1
  givenname: Renato Luiz
  orcidid: 0000-0002-3196-3008
  surname: de Freitas Cunha
  fullname: de Freitas Cunha, Renato Luiz
  email: renatoc@ufmg.br
  organization: Programa de Pós Graduação em Ciência da Computação, Av. Antônio Carlos, 6627, Belo Horizonte, 31270-901, MG, Brazil
– sequence: 2
  givenname: Luiz
  surname: Chaimowicz
  fullname: Chaimowicz, Luiz
  organization: Programa de Pós Graduação em Ciência da Computação, Av. Antônio Carlos, 6627, Belo Horizonte, 31270-901, MG, Brazil
ContentType Journal Article
Copyright 2022 Elsevier B.V.
Discipline Computer Science
EISSN 1872-7115
EndPage 252
ExternalDocumentID 10_1016_j_future_2022_09_025
S0167739X22003090
IsPeerReviewed true
IsScholarly true
Keywords Deep Reinforcement Learning
Scheduling
Simulation
Semi-Markov Decision Processes
Machine Learning
Workload traces
PageCount 14
PublicationTitle Future generation computer systems
PublicationYear 2023
Publisher Elsevier B.V
SubjectTerms Deep Reinforcement Learning
Machine Learning
Scheduling
Semi-Markov Decision Processes
Simulation
Workload traces
Title An SMDP approach for Reinforcement Learning in HPC cluster schedulers
URI https://dx.doi.org/10.1016/j.future.2022.09.025
Volume 139