SMDP-Based Coordinated Virtual Machine Allocations in Cloud-Fog Computing Systems

Bibliographic Details
Published in	IEEE Internet of Things Journal, Vol. 5, no. 3, pp. 1977-1988
Main Authors	Li, Qizhen; Zhao, Lianwen; Gao, Jie; Liang, Hongbin; Zhao, Lian; Tang, Xiaohu
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.06.2018
Subjects	Allocations; Cloud computing; Cloud-fog computing systems; Computational modeling; Computer simulation; Computing costs; Heuristic algorithms; Internet of Things; Iterative algorithms; Iterative methods; Learning (artificial intelligence); Markov analysis; Markov chains; model-based planning; model-free reinforcement learning (RL); Performance enhancement; Planning; Resource management; semi-Markov decision process (SMDP); Transition probabilities; Virtual environments; virtual machine (VM) allocation
ISSN	2327-4662
DOI	10.1109/JIOT.2018.2818680

Summary:	Heterogeneous computing powered by remote clouds and local fogs is a promising technology for improving the performance of user terminals in the Internet of Things. In this paper, two semi-Markov decision process (SMDP)-based coordinated virtual machine (VM) allocation methods are proposed to balance the tradeoff between the high cost of services provided by the remote cloud and the limited computing capacity of the local fog. We first present a model-based planning method, which requires estimating the state transition probabilities and the expected time intervals between adjacent decision epochs. To make these quantities easier to obtain, the SMDP is reduced to a continuous-time Markov decision process (CTMDP) in which service requests and ongoing service completions follow a continuous-time Markov chain. The relative value iteration algorithm for the CTMDP is then used to find an asymptotically optimal VM allocation policy. We also propose a model-free reinforcement learning (RL) method, in which an optimal coordinated VM allocation policy is approximated by learning from state and reward feedback. Simulation results show that the performance of the model-free RL method converges to a level similar to that of the model-based planning method and that it outperforms the greedy VM allocation method.
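
The relative value iteration step mentioned in the summary can be illustrated with a small sketch. This is not the model from the paper: it assumes a single fog with a handful of VMs, Poisson request arrivals, exponential service times, and a binary choice per request (serve in the fog or forward to the cloud). The CTMDP is converted to a discrete-time chain by uniformization, which is a standard technique chosen here for convenience and is not stated in the summary. The function name and every parameter value (`n_vms`, `lam`, `mu`, `r_fog`, `r_cloud`) are hypothetical.

```python
import numpy as np


def relative_value_iteration(n_vms=5, lam=3.0, mu=1.0, r_fog=1.0,
                             r_cloud=0.4, tol=1e-9, max_iter=100_000):
    """Average-reward relative value iteration on a toy fog/cloud CTMDP.

    State s = number of busy fog VMs (0..n_vms).
    Action 0 = forward the next request to the cloud,
    action 1 = serve it on a free fog VM (only allowed when s < n_vms).
    All parameters are illustrative assumptions, not values from the paper.
    """
    big_lambda = lam + n_vms * mu      # uniformization rate (max total event rate)
    p_arr = lam / big_lambda           # probability the next event is an arrival
    h = np.zeros(n_vms + 1)            # relative value function, h[0] pinned to 0
    q = np.zeros((n_vms + 1, 2))
    diff = np.zeros(n_vms + 1)

    for _ in range(max_iter):
        for s in range(n_vms + 1):
            p_done = s * mu / big_lambda      # one busy VM finishes
            p_idle = 1.0 - p_arr - p_done     # fictitious self-transition
            no_arrival = p_done * h[max(s - 1, 0)] + p_idle * h[s]
            # Forward to the cloud: fog occupancy is unchanged.
            q[s, 0] = p_arr * (r_cloud + h[s]) + no_arrival
            # Serve in the fog: occupancy grows by one, if a VM is free.
            if s < n_vms:
                q[s, 1] = p_arr * (r_fog + h[s + 1]) + no_arrival
            else:
                q[s, 1] = -np.inf
        th = q.max(axis=1)             # Bellman backup T h
        diff = th - h
        h = th - th[0]                 # subtract the reference state's value
        if diff.max() - diff.min() < tol:   # span stopping criterion
            break

    gain_per_step = 0.5 * (diff.max() + diff.min())
    policy = q.argmax(axis=1)
    # Convert the per-step gain of the uniformized chain to reward per unit time.
    return policy, gain_per_step * big_lambda, h


if __name__ == "__main__":
    policy, gain, h = relative_value_iteration()
    print("policy (1 = fog, 0 = cloud) per state:", policy)
    print("average reward per unit time:", gain)
```

Because every arriving request earns either `r_fog` or `r_cloud` in this toy setting, the long-run average reward per unit time must lie between `lam * r_cloud` and `lam * r_fog`, which gives a quick sanity check on the returned gain.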