A Versatile Simulated Data Transport Layer for in Situ Workflows Performance Evaluation
| Published in | Proceedings / IEEE International Conference on Cluster Computing pp. 1 - 11 |
|---|---|
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 02.09.2025 |
| ISSN | 2168-9253 |
| DOI | 10.1109/CLUSTER59342.2025.11186460 |
| Summary: | In situ processing not only allows scientific applications to cope with the explosion in data volume and velocity but also addresses the time constraints of many simulation-analysis workflows by providing scientists with early insights about their applications at runtime. Multiple frameworks implement the concept of a data transport layer (DTL) to enable such in situ workflows. These tools are very versatile: they directly or indirectly access data generated on the same node, on another node of the same compute cluster, or on a completely distinct node, and they allow data publishers and subscribers to run on the same computing resources or on separate ones. This versatility places on researchers the onus of making key decisions about resource allocation and data transport to ensure the most efficient execution of their in situ workflows. However, domain scientists and workflow practitioners lack appropriate tools to assess the respective performance of particular design and deployment options. In this paper, we introduce a versatile simulated DTL designed to give researchers insight into the respective performance of different execution scenarios of in situ workflows. This open-source, standalone library builds on the SimGrid toolkit and can be linked to any SimGrid-based simulator. It facilitates the evaluation, at scale, of the performance behavior of different data transport configurations and the study of the effects of resource allocation strategies. We demonstrate the scalability, versatility, and accuracy of this simulated DTL by reproducing the execution of two synthetic benchmarks and of a real-world in situ workflow composed of an MPI application and a parallel data analysis. Results of simulations run on a single core show that the proposed library can simulate the interactions of tens of thousands of simulated processes deployed on two interconnected commodity clusters in a few seconds, and the execution of an in situ workflow by a thousand simulated processes in less than three minutes. |
|---|---|