An SMDP approach for Reinforcement Learning in HPC cluster schedulers
Deep reinforcement learning applied to computing systems has shown potential for improving system performance, as well as faster discovery of better allocation strategies. In this paper, we map HPC batch job scheduling to the SMDP formalism, and present an online, deep reinforcement learning-based s...
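
As a rough illustration of what the SMDP framing changes, the sketch below computes a discounted return when decisions are separated by variable amounts of time rather than by unit steps. It is a minimal sketch, not the authors' implementation: the function name `smdp_return`, the `(reward, duration)` transition layout, and the discount value are assumptions made for this example, and it assumes each reward is credited at the start of its transition.

```python
def smdp_return(transitions, gamma=0.99):
    """Discounted return over variable-duration transitions.

    transitions: iterable of (reward, duration) pairs, where duration is the
    time that elapses before the next decision point (hypothetical layout).
    """
    total = 0.0
    elapsed = 0.0  # time elapsed before the current transition starts
    for reward, duration in transitions:
        # Discount by elapsed time, not by the number of decisions taken.
        total += (gamma ** elapsed) * reward
        elapsed += duration
    return total


if __name__ == "__main__":
    # Two decisions: the second one happens 2 time units after the first.
    print(smdp_return([(1.0, 2), (1.0, 3)], gamma=0.5))
```

With every duration equal to 1, the computation reduces to the ordinary MDP discounted return.
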
| Published in | Future Generation Computer Systems, Vol. 139, pp. 239-252 |
|---|---|
| Main Authors | de Freitas Cunha, Renato Luiz; Chaimowicz, Luiz |
| Format | Journal Article | 
| Language | English | 
| Published | Elsevier B.V, 01.02.2023 |
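| Subjects | Deep Reinforcement Learning; Machine Learning; Scheduling; Semi-Markov Decision Processes; Simulation; Workload traces |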
| ISSN | 0167-739X; EISSN 1872-7115 |
| DOI | 10.1016/j.future.2022.09.025 | 
| Abstract | Deep reinforcement learning applied to computing systems has shown potential for improving system performance, as well as faster discovery of better allocation strategies. In this paper, we map HPC batch job scheduling to the SMDP formalism, and present an online, deep reinforcement learning-based solution that uses a modification of the Proximal Policy Optimization algorithm for minimizing job slowdown with action masking, supporting large action spaces. In our experiments, we assess the effects of noise in run time estimates in our model, evaluating how it behaves in small (64 processors) and large (16384 processors) clusters. We also show our model is robust to changes in workload and in cluster sizes, showing transfer works with changes of cluster size of up to 10×, and changes from synthetic workload generators to supercomputing workload traces. In our experiments, the proposed model outperforms learning models from the literature and classic heuristics, making it a viable modeling approach for robust, transferable, learning scheduling models.
• An SMDP formulation of HPC job scheduling that minimizes average bounded slowdown. • Evaluation of the effects of noise in run time estimates in small and large clusters. • Evaluation of the SMDP model with traces from real supercomputing workloads. • A comparison of agent performance with models from the literature. | 
    
|---|---|
| Author | de Freitas Cunha, Renato Luiz; Chaimowicz, Luiz | 
    
| Author_xml | – sequence: 1 givenname: Renato Luiz orcidid: 0000-0002-3196-3008 surname: de Freitas Cunha fullname: de Freitas Cunha, Renato Luiz email: renatoc@ufmg.br organization: Programa de Pós Graduação em Ciência da Computação, Av. Antônio Carlos, 6627, Belo Horizonte, 31270-901, MG, Brazil – sequence: 2 givenname: Luiz surname: Chaimowicz fullname: Chaimowicz, Luiz organization: Programa de Pós Graduação em Ciência da Computação, Av. Antônio Carlos, 6627, Belo Horizonte, 31270-901, MG, Brazil  | 
    
| ContentType | Journal Article | 
    
| Copyright | 2022 Elsevier B.V. | 
    
| Discipline | Computer Science | 
    
| EISSN | 1872-7115 | 
    
| EndPage | 252 | 
    
| IsPeerReviewed | true | 
    
| IsScholarly | true | 
    
| Keywords | Deep Reinforcement Learning; Scheduling; Simulation; Semi-Markov Decision Processes; Machine Learning; Workload traces | 
    
| ORCID | 0000-0002-3196-3008 | 
    
| PageCount | 14 | 
    
| PublicationCentury | 2000 | 
    
| PublicationDate | February 2023 2023-02-00  | 
    
| PublicationDateYYYYMMDD | 2023-02-01 | 
    
| PublicationDecade | 2020 | 
    
| PublicationTitle | Future generation computer systems | 
    
| PublicationYear | 2023 | 
    
| Publisher | Elsevier B.V | 
    
| StartPage | 239 | 
    
| SubjectTerms | Deep Reinforcement Learning; Machine Learning; Scheduling; Semi-Markov Decision Processes; Simulation; Workload traces | 
    
| Title | An SMDP approach for Reinforcement Learning in HPC cluster schedulers | 
    
| URI | https://dx.doi.org/10.1016/j.future.2022.09.025 | 
    
| Volume | 139 | 
    
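
The abstract names two ingredients that are easy to sketch in isolation: slowdown as the quantity to minimize (the highlights specify average bounded slowdown) and action masking. The block below is a minimal sketch under stated assumptions, not the paper's code. It uses the commonly used definition of bounded slowdown, max(1, (wait + run) / max(run, tau)), with a threshold `tau` of 10 seconds chosen here as an assumption, and it masks invalid actions by setting their logits to negative infinity before a softmax. All function names are hypothetical.

```python
import math


def bounded_slowdown(wait, run, tau=10.0):
    """Bounded slowdown of one job; tau guards against very short jobs."""
    return max(1.0, (wait + run) / max(run, tau))


def average_bounded_slowdown(jobs, tau=10.0):
    """Mean bounded slowdown over (wait, run) pairs, both in seconds."""
    return sum(bounded_slowdown(w, r, tau) for w, r in jobs) / len(jobs)


def masked_softmax(logits, valid):
    """Softmax over logits, giving zero probability to invalid actions.

    valid[i] is True when action i may be taken (for example, when job i
    fits in the currently free processors; this criterion is an assumption).
    """
    masked = [l if v else -math.inf for l, v in zip(logits, valid)]
    peak = max(masked)
    if peak == -math.inf:
        raise ValueError("at least one action must be valid")
    # exp(-inf) evaluates to 0.0, so masked actions drop out of the sum.
    exps = [math.exp(m - peak) for m in masked]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    print(average_bounded_slowdown([(90.0, 10.0), (0.0, 100.0)]))
    print(masked_softmax([0.0, 0.0, 5.0], [True, True, False]))
```

Masking before normalization keeps the policy a proper probability distribution over the valid actions only, which is what makes large action spaces with many temporarily infeasible choices tractable to sample from.
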