Deadline-Aware Resource Allocation and Scheduling of Serverless Workloads on Heterogeneous Clusters

Bibliographic Details
Published in	Proceedings / IEEE International Conference on Cluster Computing, pp. 1-11
Main Authors	Fritz, Matthias; Benkner, Siegfried; Bajrovic, Enes
Format	Conference Proceeding
Language	English
Published	IEEE 02.09.2025
Subjects	Dynamic Resource Tuning; Dynamic scheduling; Heterogeneous Clusters; Manuals; Measurement; Optimization; Pricing; Processor scheduling; Real-time systems; Reliability; Resource Allocation; Scheduling; Serverless; Serverless computing; Tuning
ISSN	2168-9253
DOI	10.1109/CLUSTER59342.2025.11186486

Summary:	Serverless computing has become widely adopted as a cloud deployment model due to its ease of use and fine-grained pay-as-you-go pricing. By hiding infrastructure complexity, it simplifies access to cloud resources and lets developers focus on application code. However, most serverless platforms operate on a best-effort basis and provide minimal control over performance tuning. Combined with limited visibility into underlying hardware, this makes it difficult to reliably meet Service Level Objectives (SLOs). To address this, we introduce DHRT, a deadline- and heterogeneity-aware scheduling and resource allocation framework for performance-critical serverless workloads. DHRT applies heuristic-driven online optimisation to iteratively refine resource estimates by leveraging real-time metrics and historical data from live executions. To fulfil SLOs, it accounts for both workload characteristics and node heterogeneity. We evaluate DHRT on synthetic workloads by comparing it against baseline scheduling and resource allocation policies commonly used in FaaS platforms. Results show that DHRT accurately estimates resource demands within a few live executions, eliminating the need for manual resource tuning. By exploiting node heterogeneity and dynamically scaling vCPU allocations as workloads near their deadlines, DHRT improves resource efficiency and significantly reduces deadline violations.