Collaborative Learning-Based Scheduling for Kubernetes-Oriented Edge-Cloud Network

Bibliographic Details
Published in: IEEE/ACM Transactions on Networking, Vol. 31, no. 6, pp. 1-15
Main Authors: Shen, Shihao; Han, Yiwen; Wang, Xiaofei; Wang, Shiqiang; Leung, Victor C. M.
Format: Journal Article
Language: English
Published: New York, IEEE, 01.12.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1063-6692, 1558-2566
DOI: 10.1109/TNET.2023.3267168

Summary: Kubernetes (k8s) has the potential to coordinate distributed edge resources and centralized cloud resources, but currently lacks a specialized scheduling framework for edge-cloud networks. Besides, the hierarchical distribution of heterogeneous resources makes the modeling and scheduling of k8s-oriented edge-cloud networks particularly challenging. In this paper, we introduce KaiS, a learning-based scheduling framework for such edge-cloud networks that aims to improve the long-term throughput rate of request processing. First, we design a coordinated multi-agent actor-critic algorithm to cater to decentralized request dispatch and dynamic dispatch spaces within the edge cluster. Second, for diverse system scales and structures, we use graph neural networks to embed system state information, and combine the embedding results with multiple policy networks to reduce the orchestration dimensionality by stepwise scheduling. Finally, we adopt a two-time-scale scheduling mechanism to harmonize request dispatch and service orchestration, and present the implementation design for deploying the above algorithms in a way that is compatible with native k8s components. Experiments using real workload traces show that KaiS can successfully learn appropriate scheduling policies, irrespective of request arrival patterns and system scales. Moreover, KaiS can enhance the average system throughput rate by 15.9% while reducing scheduling cost by 38.4% compared to baselines.
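
The two-time-scale idea in the summary (per-request dispatch on a fast timescale, service orchestration on a slower one) can be illustrated with a minimal sketch. This is not the KaiS implementation: the learned policies are replaced here by simple greedy rules, and every name (`dispatch`, `orchestrate`, `run`, `frame`, `slots_per_node`) is a hypothetical introduced only for illustration.

```python
from collections import Counter


def dispatch(service, placement, load):
    """Fast timescale: route one request to the least-loaded node hosting its service.

    Greedy stand-in (an assumption) for a learned dispatch policy.
    Returns the chosen node, or None if no node hosts the service.
    """
    hosts = [node for node, services in placement.items() if service in services]
    if not hosts:
        return None
    node = min(hosts, key=lambda n: load[n])
    load[node] += 1
    return node


def orchestrate(placement, demand, slots_per_node):
    """Slow timescale: host the most-demanded services on every node.

    Greedy stand-in (an assumption) for a learned orchestration policy.
    """
    ranked = [svc for svc, _ in demand.most_common(slots_per_node)]
    return {node: set(ranked) for node in placement}


def run(requests, placement, slots_per_node=1, frame=4):
    """Dispatch every request; re-orchestrate once every `frame` requests."""
    load = {node: 0 for node in placement}
    demand = Counter()
    served = 0
    for t, service in enumerate(requests, start=1):
        demand[service] += 1
        if dispatch(service, placement, load) is not None:
            served += 1
        if t % frame == 0:  # slow-timescale boundary
            placement = orchestrate(placement, demand, slots_per_node)
            demand.clear()
    return served, placement
```

Calling `run` with a request stream whose demand shifts from one service to another shows the slow loop adapting the placement while the fast loop keeps dispatching against whatever placement is current.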