Navigating Exascale Operational Data Analytics: From Inundation to Insight
    
| Published in | SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1795-1804 |
|---|---|
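| Main Authors | Shin, Woong; Osborne, Tim; Karimi, Ahmad Maroof; Palumbo, Rachel; May, Alex; Lester, Corwin; Hines, Jesse; Sattar, Naw Safrin; Huk, Leah; Simmerman, Scott; Brewer, Wesley; Miller, Jeffrey; Adamson, Ryan; Kuchar, Olga; Prout, Ryan; Wang, Feiyi; Atchley, Scott; Oral, Sarp |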
| Format | Conference Proceeding | 
| Language | English | 
| Published | IEEE, 17.11.2024 |
| DOI | 10.1109/SCW63240.2024.00226 | 
| Abstract | In this paper, we address the challenges of achieving sustainable data-driven efficiency by providing a detailed exploration of the end-to-end operational data analytics (ODA) framework that evolved through two generations of supercomputer systems at the Oak Ridge Leadership Computing Facility (OLCF). This framework addresses large data streams ingested from a heavily instrumented HPC environment that accumulates multiple terabytes per day. We outline the multifaceted data life cycle across HPC procurement, operations, and research & development, identifying key obstacles and design decisions that shape effective strategies for building and supporting end-to-end data pipelines. By sharing key insights and lessons learned from our experience, we offer recommendations for the HPC community on enabling sustainable operational data analytics and beyond. Our contributions aim to bridge the gap between the potential and real benefits of operational data, guiding future efforts towards integrated and sustainable operational intelligence in high-performance computing environments. |
| Author | Atchley, Scott; Karimi, Ahmad Maroof; Palumbo, Rachel; Simmerman, Scott; May, Alex; Prout, Ryan; Wang, Feiyi; Adamson, Ryan; Brewer, Wesley; Osborne, Tim; Huk, Leah; Kuchar, Olga; Hines, Jesse; Sattar, Naw Safrin; Miller, Jeffrey; Shin, Woong; Oral, Sarp; Lester, Corwin |
    
| Author_xml | 1. Shin, Woong (shinw@ornl.gov); 2. Osborne, Tim (osbornetd@ornl.gov); 3. Karimi, Ahmad Maroof (karimiahmad@ornl.gov); 4. Palumbo, Rachel (palumborl@ornl.gov); 5. May, Alex (mayab@ornl.gov); 6. Lester, Corwin (lestercp@ornl.gov); 7. Hines, Jesse (hinesjr@ornl.gov); 8. Sattar, Naw Safrin (sattarn@ornl.gov); 9. Huk, Leah (hukln@ornl.gov); 10. Simmerman, Scott (simmermansg@ornl.gov); 11. Brewer, Wesley (brewerwh@ornl.gov); 12. Miller, Jeffrey (millerjl@ornl.gov); 13. Adamson, Ryan (adamsonrm@ornl.gov); 14. Kuchar, Olga (kucharoa@ornl.gov); 15. Prout, Ryan (proutrc@ornl.gov); 16. Wang, Feiyi (fwang2@ornl.gov); 17. Atchley, Scott (scott@ornl.gov); 18. Oral, Sarp (oralhs@ornl.gov). Organization for all authors: Oak Ridge National Laboratory, Oak Ridge, TN |
    
| CODEN | IEEPAD | 
    
| EISBN | 9798350355543 | 
    
| EndPage | 1804 | 
    
| ExternalDocumentID | oai:osti.gov:2538413 10820578  | 
    
| Genre | orig-research | 
    
| GrantInformation_xml | Oak Ridge National Laboratory (funder ID: 10.13039/100006228); U.S. Department of Energy (funder ID: 10.13039/100000015) |
    
| IsDoiOpenAccess | false | 
    
| IsOpenAccess | true | 
    
| IsPeerReviewed | false | 
    
| IsScholarly | false | 
    
| OpenAccessLink | https://proxy.k.utb.cz/login?url=http://www.osti.gov/servlets/purl/2538413 | 
    
| PageCount | 10 | 
    
| PublicationCentury | 2000 | 
    
| PublicationDate | 2024-Nov.-17 | 
    
| PublicationDateYYYYMMDD | 2024-11-17 | 
    
| PublicationDate_xml | – month: 11 year: 2024 text: 2024-Nov.-17 day: 17  | 
    
| PublicationDecade | 2020 | 
    
| PublicationTitle | SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis | 
    
| PublicationTitleAbbrev | SC-W | 
    
| PublicationYear | 2024 | 
    
| Publisher | IEEE | 
    
| Publisher_xml | – name: IEEE | 
    
| StartPage | 1795 | 
    
| SubjectTerms | Complexity theory; Data analysis; Data Analytics; Data Governance; HPC Post-Exascale Challenges; Machine Learning Applications; Monitoring; Operational Data Analytics; Organizations; Pipelines; Software; Standards organizations; Streams; Stress; Subject matter experts; Sustainable development; Telemetry; Visual Analytics |
    
| Title | Navigating Exascale Operational Data Analytics: From Inundation to Insight | 
    
| URI | https://ieeexplore.ieee.org/document/10820578 http://www.osti.gov/servlets/purl/2538413  | 
    
| UnpaywallVersion | submittedVersion | 
    