Data Logistics Service in eFlows4HPC
Published in | 2024 47th MIPRO ICT and Electronics Convention (MIPRO) pp. 892 - 897 |
---|---|
Main Authors | Rybicki, Jedrzej; Bottcher, Christian |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 20.05.2024 |
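Subjects | Buildings; Cloud; Computational modeling; Data transfer; Distributed Data; Distributed databases; Full stack; High Performance Computing; Pipelines; Reproducibility of results |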
ISSN | 2623-8764 |
DOI | 10.1109/MIPRO60963.2024.10569664 |
Abstract | Modern scientific endeavors often require complex, data-intensive workflows leveraging distributed and heterogeneous computing and data resources. Such workflows often include multiple steps of classical simulations, but increasingly also ML and AI components. As a result, they use not only HPC, but also Cloud-like resources. Efficient and user-friendly execution and management of such workflows pose many challenges. In this paper, we share our experience in implementing three such workflows in the eFlows4HPC project. We focus on the data management dimension of the workflows: how to ensure the timely availability of the required data, how to move data to and from compute resources, and how to make the workflows complete and portable. To this end, we implemented the Data Logistics Service, integrated it with the workflow execution engine, and defined multiple data movement pipelines to cater for specific scientific needs. We share our experience from the implementation and operation of the service, including building a solution for continuous deployment and access management in a federated environment. On a more abstract level, we also explore how the presented approach fits into the vision of the FAIR paradigm. |
---|---|
Author | Bottcher, Christian; Rybicki, Jedrzej |
Author_xml | – sequence: 1 givenname: Jedrzej surname: Rybicki fullname: Rybicki, Jedrzej email: j.rybicki@fz-juelich.de organization: Juelich Supercomputing Center, Juelich, Germany – sequence: 2 givenname: Christian surname: Bottcher fullname: Bottcher, Christian email: c.boettcher@fz-juelich.de organization: Juelich Supercomputing Center, Juelich, Germany |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/MIPRO60963.2024.10569664 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Xplore IEEE Proceedings Order Plans (POP All) 1998-Present |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
EISBN | 9798350382501 9798350382495 |
EISSN | 2623-8764 |
EndPage | 897 |
ExternalDocumentID | 10569664 |
Genre | orig-research |
GrantInformation_xml | – fundername: Ministry of Education funderid: 10.13039/501100002701 |
GroupedDBID | 6IE 6IL ALMA_UNASSIGNED_HOLDINGS CBEJK M43 RIE RIL |
IEDL.DBID | RIE |
IngestDate | Wed Aug 27 02:06:46 EDT 2025 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
PageCount | 6 |
ParticipantIDs | ieee_primary_10569664 |
PublicationCentury | 2000 |
PublicationDate | 2024-May-20 |
PublicationDateYYYYMMDD | 2024-05-20 |
PublicationDate_xml | – month: 05 year: 2024 text: 2024-May-20 day: 20 |
PublicationDecade | 2020 |
PublicationTitle | 2024 47th MIPRO ICT and Electronics Convention (MIPRO) |
PublicationTitleAbbrev | MIPRO |
PublicationYear | 2024 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SSID | ssib042470096 |
Score | 1.8837678 |
SourceID | ieee |
SourceType | Publisher |
StartPage | 892 |
SubjectTerms | Buildings; Cloud; Computational modeling; Data transfer; Distributed Data; Distributed databases; Full stack; High Performance Computing; Pipelines; Reproducibility of results |
Title | Data Logistics Service in eFlows4HPC |
URI | https://ieeexplore.ieee.org/document/10569664 |