Navigating Exascale Operational Data Analytics: From Inundation to Insight

Bibliographic Details
Published in SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1795-1804
Main Authors Shin, Woong; Osborne, Tim; Karimi, Ahmad Maroof; Palumbo, Rachel; May, Alex; Lester, Corwin; Hines, Jesse; Sattar, Naw Safrin; Huk, Leah; Simmerman, Scott; Brewer, Wesley; Miller, Jeffrey; Adamson, Ryan; Kuchar, Olga; Prout, Ryan; Wang, Feiyi; Atchley, Scott; Oral, Sarp
Format Conference Proceeding
Language English
Published IEEE 17.11.2024
DOI 10.1109/SCW63240.2024.00226

Abstract In this paper, we address the challenges in achieving sustainable data-driven efficiency by providing a detailed exploration of the end-to-end operational data analytics (ODA) framework that evolved through two generations of supercomputer systems at the Oak Ridge Leadership Computing Facility (OLCF). This framework addresses large data streams ingested from a heavily instrumented HPC environment that accumulates multiple terabytes per day. We outline the multifaceted data life cycle across HPC procurement, operations, and research & development, identifying key obstacles and design decisions that shape effective strategies for building and supporting data pipelines end-to-end. By sharing key insights and lessons learned from our experience, we offer recommendations for the HPC community on enabling sustainable operational data analytics and beyond. Our contributions aim to bridge the gap between the potential and real benefits of operational data, guiding future efforts towards integrated and sustainable operational intelligence in high-performance computing environments.
Author_xml – sequence: 1
  givenname: Woong
  surname: Shin
  fullname: Shin, Woong
  email: shinw@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 2
  givenname: Tim
  surname: Osborne
  fullname: Osborne, Tim
  email: osbornetd@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 3
  givenname: Ahmad Maroof
  surname: Karimi
  fullname: Karimi, Ahmad Maroof
  email: karimiahmad@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 4
  givenname: Rachel
  surname: Palumbo
  fullname: Palumbo, Rachel
  email: palumborl@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 5
  givenname: Alex
  surname: May
  fullname: May, Alex
  email: mayab@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 6
  givenname: Corwin
  surname: Lester
  fullname: Lester, Corwin
  email: lestercp@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 7
  givenname: Jesse
  surname: Hines
  fullname: Hines, Jesse
  email: hinesjr@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 8
  givenname: Naw Safrin
  surname: Sattar
  fullname: Sattar, Naw Safrin
  email: sattarn@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 9
  givenname: Leah
  surname: Huk
  fullname: Huk, Leah
  email: hukln@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 10
  givenname: Scott
  surname: Simmerman
  fullname: Simmerman, Scott
  email: simmermansg@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 11
  givenname: Wesley
  surname: Brewer
  fullname: Brewer, Wesley
  email: brewerwh@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 12
  givenname: Jeffrey
  surname: Miller
  fullname: Miller, Jeffrey
  email: millerjl@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 13
  givenname: Ryan
  surname: Adamson
  fullname: Adamson, Ryan
  email: adamsonrm@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 14
  givenname: Olga
  surname: Kuchar
  fullname: Kuchar, Olga
  email: kucharoa@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 15
  givenname: Ryan
  surname: Prout
  fullname: Prout, Ryan
  email: proutrc@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 16
  givenname: Feiyi
  surname: Wang
  fullname: Wang, Feiyi
  email: fwang2@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 17
  givenname: Scott
  surname: Atchley
  fullname: Atchley, Scott
  email: scott@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
– sequence: 18
  givenname: Sarp
  surname: Oral
  fullname: Oral, Sarp
  email: oralhs@ornl.gov
  organization: Oak Ridge National Laboratory,Oak Ridge,TN
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
ADTOC
UNPAY
DOI 10.1109/SCW63240.2024.00226
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE/IET Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
Unpaywall for CDI: Periodical Content
Unpaywall
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE/IET Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
– sequence: 2
  dbid: UNPAY
  name: Unpaywall
  url: https://proxy.k.utb.cz/login?url=https://unpaywall.org/
  sourceTypes: Open Access Repository
DeliveryMethod fulltext_linktorsrc
EISBN 9798350355543
EndPage 1804
ExternalDocumentID oai:osti.gov:2538413
10820578
Genre orig-research
GrantInformation_xml – fundername: Oak Ridge National Laboratory
  funderid: 10.13039/100006228
– fundername: U.S. Department of Energy
  funderid: 10.13039/100000015
GroupedDBID 6IE
6IL
ACM
ALMA_UNASSIGNED_HOLDINGS
CBEJK
RIE
RIL
ADTOC
UNPAY
ID FETCH-LOGICAL-a328t-40ab16c4d641e483e59d6f3bc7e4344d89fb4de6f8236dc7dc382818b027c7493
IEDL.DBID RIE
IngestDate Sun Oct 26 04:17:49 EDT 2025
Wed Aug 27 01:59:34 EDT 2025
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-a328t-40ab16c4d641e483e59d6f3bc7e4344d89fb4de6f8236dc7dc382818b027c7493
OpenAccessLink https://proxy.k.utb.cz/login?url=http://www.osti.gov/servlets/purl/2538413
PageCount 10
ParticipantIDs ieee_primary_10820578
unpaywall_primary_10_1109_scw63240_2024_00226
PublicationCentury 2000
PublicationDate 2024-Nov.-17
PublicationDateYYYYMMDD 2024-11-17
PublicationDate_xml – month: 11
  year: 2024
  text: 2024-Nov.-17
  day: 17
PublicationDecade 2020
PublicationTitle SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis
PublicationTitleAbbrev SC-W
PublicationYear 2024
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssib060584085
Score 1.8925397
Snippet In this paper, we address the challenges in achieving sustainable data-driven efficiency by providing a detailed exploration of the end-to-end operational data...
SourceID unpaywall
ieee
SourceType Open Access Repository
Publisher
StartPage 1795
SubjectTerms Complexity theory
Data analysis
Data Analytics
Data Governance
HPC Post-Exascale Challenges
Machine Learning Applications
Monitoring
Operational Data Analytics
Organizations
Pipelines
Software
Standards organizations
Streams
Stress
Subject matter experts
Sustainable development
Telemetry
Visual Analytics
Title Navigating Exascale Operational Data Analytics: From Inundation to Insight
URI https://ieeexplore.ieee.org/document/10820578
http://www.osti.gov/servlets/purl/2538413
UnpaywallVersion submittedVersion
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE