Task offloading in Multiple-Services Mobile Edge Computing: A deep reinforcement learning algorithm

Bibliographic Details
Published in: Computer Communications, Vol. 202, pp. 1-12
Main Authors: Peng, Ziyu; Wang, Gaocai; Nong, Wang; Qiu, Yu; Huang, Shuqiang
Format: Journal Article
Language: English
Published: Elsevier B.V., 15.03.2023
ISSN: 0140-3664, 1873-703X
DOI: 10.1016/j.comcom.2023.02.001

More Information
Summary: Multiple-Services Mobile Edge Computing enables the task-related services cached on an edge server to be dynamically updated, and thus provides great opportunities to offload tasks to the edge server for execution. However, the requirements and popularity of services, as well as the computing requirements and the amount of data transferred from users to the edge server, vary over time. How to adaptively adjust the subset of service types cached on the resource-limited edge server, and to determine the task offloading destinations and resource allocation decisions, so as to improve overall system performance is a challenging problem. To solve this challenge, we first formulate it as a Markov decision process and then propose a soft actor–critic deep reinforcement learning-based algorithm, called DSOR, to jointly determine not only the discrete decisions of service caching and task offloading but also the continuous allocation of bandwidth and computing resources. To improve the accuracy of our algorithm, we employ an efficient trick that converts the discrete action selection into a continuous space, addressing the key design challenge arising from the continuous-discrete hybrid action space. Additionally, to improve resource utilization, a novel reward function is integrated into our algorithm to speed up the convergence of training while making full use of system resources. Extensive numerical results show that, compared with other baseline algorithms, our algorithm can effectively reduce the long-term average completion delay of tasks while achieving excellent stability.
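
The design trick mentioned in the summary, namely handling the continuous-discrete hybrid action space by letting the actor output only continuous values and recovering the discrete caching and offloading choices from them, can be illustrated with a minimal Python sketch. The dimensions, thresholding rule, and names below (NUM_SERVICES, split_hybrid_action, and so on) are illustrative assumptions, not the paper's implementation.

# Hypothetical sketch of a continuous-to-discrete action mapping: the SAC policy
# emits a purely continuous action vector, and the discrete parts (service caching,
# offloading destination) are recovered by thresholding / argmax, while the remaining
# entries are rescaled into bandwidth and computing-resource shares.
# All names and dimensions here are illustrative assumptions, not the paper's code.

import numpy as np

NUM_SERVICES = 5      # assumed number of cacheable service types
NUM_DESTINATIONS = 2  # assumed: 0 = execute locally, 1 = offload to the edge server


def split_hybrid_action(raw_action: np.ndarray):
    """Map a continuous action in [-1, 1]^d to a hybrid (discrete + continuous) decision.

    Assumed layout: the first NUM_SERVICES entries score which services to cache,
    the next NUM_DESTINATIONS entries score the offloading destination, and the
    last two entries are rescaled into bandwidth and computing-resource fractions.
    """
    cache_scores = raw_action[:NUM_SERVICES]
    dest_scores = raw_action[NUM_SERVICES:NUM_SERVICES + NUM_DESTINATIONS]
    resource_raw = raw_action[NUM_SERVICES + NUM_DESTINATIONS:]

    # Discrete part: cache a service if its score is positive; pick the destination
    # with the highest score.
    caching_decision = (cache_scores > 0.0).astype(int)
    offload_destination = int(np.argmax(dest_scores))

    # Continuous part: rescale from [-1, 1] to [0, 1] fractions of bandwidth / CPU.
    bandwidth_frac, cpu_frac = (resource_raw + 1.0) / 2.0

    return caching_decision, offload_destination, bandwidth_frac, cpu_frac


if __name__ == "__main__":
    # Example: a raw action as it might be sampled by a tanh-squashed SAC policy.
    rng = np.random.default_rng(0)
    raw = rng.uniform(-1.0, 1.0, size=NUM_SERVICES + NUM_DESTINATIONS + 2)
    print(split_hybrid_action(raw))

With a mapping of this kind the environment, rather than the policy network, performs the discretization, so a standard continuous-action soft actor–critic implementation can be reused unchanged; this is one common way to realize the conversion the summary describes.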