Supporting Real-Time Jobs on the IBM Blue Gene/Q: Simulation-Based Study

Bibliographic Details
Published in Job Scheduling Strategies for Parallel Processing Vol. 10773; pp. 83 - 102
Main Authors Wang, Daihou; Jung, Eun-Sung; Kettimuthu, Rajkumar; Foster, Ian; Foran, David J.; Parashar, Manish
Format Book Chapter
Language English
Published Switzerland Springer International Publishing AG 01.01.2018
Springer International Publishing
Series Lecture Notes in Computer Science
ISBN 9783319773971
3319773976
ISSN 0302-9743
1611-3349
DOI 10.1007/978-3-319-77398-8_5

Abstract As the volume and velocity of data generated by scientific experiments increase, the analysis of those data inevitably requires HPC resources. Successful research in a growing number of scientific fields depends on the ability to analyze data rapidly. In many situations, scientists and engineers want quasi-instant feedback, so that results from one experiment can guide selection of the next or even improve the course of a single experiment. Such real-time requirements are hard to meet on current HPC systems, which are typically batch-scheduled under policies in which an arriving job is run immediately only if enough resources are available and is otherwise queued. To meet their requirements, real-time jobs should sometimes have higher priority than batch jobs that were submitted earlier. However, accommodating more real-time jobs will negatively impact the performance of batch jobs, which may have to be preempted. The overhead involved in preempting and restarting batch jobs will, in turn, negatively impact system utilization. Here we evaluate various scheduling schemes to support real-time jobs along with the traditional batch jobs. We perform simulation studies using trace logs of Mira, the IBM BG/Q system at Argonne National Laboratory, to quantify the impact of real-time jobs on batch job performance for various percentages of real-time jobs in the workload. We present new insights gained from grouping the jobs into different categories and studying the performance of each category. Our results show that real-time jobs in all categories can achieve an average slowdown of less than 1.5 and that most categories achieve an average slowdown close to 1, with at most a 20% increase in average slowdown for some categories of batch jobs when 20% or fewer of the jobs are real-time.
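The abstract reports results as "average slowdown" without defining it. As a rough illustration only, the sketch below uses the common scheduling-literature definition, slowdown = (wait time + run time) / run time; whether the chapter uses exactly this form or a bounded variant is an assumption here, and the `Job` fields and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical fields for illustration; times are in seconds.
    submit: float   # when the job entered the queue
    start: float    # when the job began running
    runtime: float  # how long the job ran


def slowdown(job: Job) -> float:
    """Assumed definition: (wait + runtime) / runtime. A job that never waits scores 1."""
    wait = job.start - job.submit
    return (wait + job.runtime) / job.runtime


def average_slowdown(jobs: list[Job]) -> float:
    """Arithmetic mean of per-job slowdown over a job category."""
    return sum(slowdown(j) for j in jobs) / len(jobs)


# Made-up sample: one job starts immediately, one waits 50 s before a 100 s run.
jobs = [Job(submit=0, start=0, runtime=100), Job(submit=10, start=60, runtime=100)]
print(average_slowdown(jobs))
```

Under this definition, an average slowdown "close to 1" means jobs in that category spend almost no time queued relative to their run time.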
Author Parashar, Manish
Wang, Daihou
Foster, Ian
Foran, David J.
Kettimuthu, Rajkumar
Jung, Eun-Sung
Author_xml – sequence: 1
  givenname: Daihou
  surname: Wang
  fullname: Wang, Daihou
  organization: Rutgers Discovery Informatics Institute, Rutgers University, Piscataway, USA
– sequence: 2
  givenname: Eun-Sung
  orcidid: 0000-0002-1288-7521
  surname: Jung
  fullname: Jung, Eun-Sung
  email: ejung@hongik.ac.kr
  organization: Hongik University, Seoul, South Korea
– sequence: 3
  givenname: Rajkumar
  surname: Kettimuthu
  fullname: Kettimuthu, Rajkumar
  organization: MCS Division, Argonne National Laboratory, Lemont, USA
– sequence: 4
  givenname: Ian
  surname: Foster
  fullname: Foster, Ian
  organization: Department of Computer Science, University of Chicago, Chicago, USA
– sequence: 5
  givenname: David J.
  surname: Foran
  fullname: Foran, David J.
  organization: Rutgers Cancer Institute of New Jersey, New Brunswick, USA
– sequence: 6
  givenname: Manish
  surname: Parashar
  fullname: Parashar, Manish
  organization: Rutgers Discovery Informatics Institute, Rutgers University, Piscataway, USA
ContentType Book Chapter
Copyright Springer International Publishing AG, part of Springer Nature 2018
Copyright_xml – notice: Springer International Publishing AG, part of Springer Nature 2018
DBID FFUUA
DEWEY 004.35
DOI 10.1007/978-3-319-77398-8_5
DatabaseName ProQuest Ebook Central - Book Chapters - Demo use only
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 9783319773988
3319773984
EISSN 1611-3349
Editor Cirne, Walfredo
Klusáček, Dalibor
Desai, Narayan
Editor_xml – sequence: 1
  fullname: Klusáček, Dalibor
– sequence: 2
  fullname: Cirne, Walfredo
– sequence: 3
  fullname: Desai, Narayan
EndPage 102
ExternalDocumentID EBC6284209_92_92
EBC5596400_92_92
ISBN 9783319773971
3319773976
ISSN 0302-9743
IngestDate Tue Jul 29 20:10:53 EDT 2025
Thu May 29 00:53:33 EDT 2025
Thu May 29 01:06:23 EDT 2025
IsPeerReviewed true
IsScholarly true
LCCallNum QA76.758 QA76.9.A73 QA
Language English
LinkModel OpenURL
OCLC 1029604609
ORCID 0000-0002-1288-7521
PQID EBC5596400_92_92
PageCount 20
ParticipantIDs springer_books_10_1007_978_3_319_77398_8_5
proquest_ebookcentralchapters_6284209_92_92
proquest_ebookcentralchapters_5596400_92_92
PublicationCentury 2000
PublicationDate 2018-01-01
PublicationDateYYYYMMDD 2018-01-01
PublicationDate_xml – month: 01
  year: 2018
  text: 2018-01-01
  day: 01
PublicationDecade 2010
PublicationPlace Switzerland
PublicationPlace_xml – name: Switzerland
– name: Cham
PublicationSeriesSubtitle Theoretical Computer Science and General Issues
PublicationSeriesTitle Lecture Notes in Computer Science
PublicationSeriesTitleAlternate Lect.Notes Computer
PublicationSubtitle 21st International Workshop, JSSPP 2017, Orlando, FL, USA, June 2, 2017, Revised Selected Papers
PublicationTitle Job Scheduling Strategies for Parallel Processing
PublicationYear 2018
Publisher Springer International Publishing AG
Springer International Publishing
Publisher_xml – name: Springer International Publishing AG
– name: Springer International Publishing
RelatedPersons Kleinberg, Jon M.
Mattern, Friedemann
Naor, Moni
Mitchell, John C.
Terzopoulos, Demetri
Steffen, Bernhard
Pandu Rangan, C.
Kanade, Takeo
Kittler, Josef
Weikum, Gerhard
Hutchison, David
Tygar, Doug
RelatedPersons_xml – sequence: 1
  givenname: David
  surname: Hutchison
  fullname: Hutchison, David
  organization: Lancaster University, Lancaster, United Kingdom
– sequence: 2
  givenname: Takeo
  surname: Kanade
  fullname: Kanade, Takeo
  organization: Carnegie Mellon University, Pittsburgh, USA
– sequence: 3
  givenname: Josef
  surname: Kittler
  fullname: Kittler, Josef
  organization: University of Surrey, Guildford, United Kingdom
– sequence: 4
  givenname: Jon M.
  surname: Kleinberg
  fullname: Kleinberg, Jon M.
  organization: Cornell University, Ithaca, USA
– sequence: 5
  givenname: Friedemann
  surname: Mattern
  fullname: Mattern, Friedemann
  organization: ETH Zurich, Zurich, Switzerland
– sequence: 6
  givenname: John C.
  surname: Mitchell
  fullname: Mitchell, John C.
  organization: Stanford University, Stanford, USA
– sequence: 7
  givenname: Moni
  surname: Naor
  fullname: Naor, Moni
  organization: Weizmann Institute of Science, Rehovot, Israel
– sequence: 8
  givenname: C.
  surname: Pandu Rangan
  fullname: Pandu Rangan, C.
  organization: Indian Institute of Technology, Chennai, India
– sequence: 9
  givenname: Bernhard
  surname: Steffen
  fullname: Steffen, Bernhard
  organization: TU Dortmund University, Dortmund, Germany
– sequence: 10
  givenname: Demetri
  surname: Terzopoulos
  fullname: Terzopoulos, Demetri
  organization: University of California, Los Angeles, USA
– sequence: 11
  givenname: Doug
  surname: Tygar
  fullname: Tygar, Doug
  organization: University of California, Berkeley, USA
– sequence: 12
  givenname: Gerhard
  surname: Weikum
  fullname: Weikum, Gerhard
  organization: Max Planck Institute for Informatics, Saarbrücken, Germany
SourceID springer
proquest
SourceType Publisher
StartPage 83
SubjectTerms Preemptive scheduling
Real-time job scheduling
Scheduler simulation
Supercomputing
Title Supporting Real-Time Jobs on the IBM Blue Gene/Q: Simulation-Based Study
URI http://ebookcentral.proquest.com/lib/SITE_ID/reader.action?docID=5596400&ppg=92
http://ebookcentral.proquest.com/lib/SITE_ID/reader.action?docID=6284209&ppg=92
http://link.springer.com/10.1007/978-3-319-77398-8_5
Volume 10773