Minimizing the AoI in Resource-Constrained Multi-Source Relaying Systems: Dynamic and Learning-based Scheduling

Bibliographic Details
Published in: IEEE Transactions on Wireless Communications, Vol. 23, No. 1, p. 1
Main Authors: Zakeri, Abolfazl; Moltafet, Mohammad; Leinonen, Markus; Codreanu, Marian
Format: Journal Article
Language: English
Published: New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.01.2024
ISSN: 1536-1276
EISSN: 1558-2248
DOI: 10.1109/TWC.2023.3278460

Summary: We consider a multi-source relaying system in which independent sources randomly generate status update packets that are sent to the destination with the aid of a relay over unreliable links. We develop transmission scheduling policies that minimize the weighted sum average age of information (AoI) subject to transmission capacity and long-run average resource constraints. We formulate a stochastic control optimization problem and solve it using both a constrained Markov decision process (CMDP) approach and a drift-plus-penalty method. The CMDP problem is solved by transforming it into an MDP problem via Lagrangian relaxation. We theoretically analyze the structure of optimal policies for the MDP problem and then propose a structure-aware algorithm that returns a practical near-optimal policy. Using the drift-plus-penalty method, we devise a near-optimal, low-complexity policy that makes scheduling decisions dynamically. We also develop a model-free deep reinforcement learning policy, which employs Lyapunov optimization theory and a dueling double deep Q-network. The complexities of the proposed policies are analyzed. Simulation results are provided to assess the performance of our policies and to validate the theoretical results; they show up to a 91% performance improvement over a baseline policy.