Delay Deterministic Cell-Free MIMO Transmission via Safety Reinforcement Learning

Bibliographic Details
Published in	IEEE Transactions on Wireless Communications, p. 1
Main Authors	Meng, Fan; Zhang, Cheng; Huang, Yongming; You, Xiaohu
Format	Journal Article
Language	English
Published	IEEE 2025
Subjects	Array signal processing; cell-free; Delays; deterministic communication; Downlink; Jitter; MIMO; Optimization; Precoding; Resource management; safety reinforcement learning; time-space-frequency precoding; Ultra reliable low latency communication; Wireless communication; WMMSE
ISSN	1536-1276 1558-2248
DOI	10.1109/TWC.2025.3601221

Summary:	Deterministic communication within the wireless domain is essential for industrial applications. However, the stochastic nature of wireless communication introduces substantial challenges for time-sensitive networking (TSN) services, which require strict end-to-end latency bounds. This paper addresses the challenge of minimizing long-term delay jitter in downlink cell-free multi-user multiple-input multiple-output orthogonal frequency division multiple access (MU-MIMO OFDMA) systems, subject to heterogeneous delay upper bounds and satisfaction rates. The problem involves time-space-frequency precoding constrained by user-specific delay violation probabilities and transmit power limits. To overcome the limitations of model-driven methods in handling implicit system models and the inefficiency of data-driven approaches in large action spaces, we propose a hybrid solution. Specifically, we decompose the problem into two sub-problems: rate scheduling via a constrained Markov decision process (CMDP), and instantaneous precoding through weighted sum-rate (WSR) maximization. We develop a safety reinforcement learning-based algorithm to optimize rate scheduling by allocating user weights, and a weighted minimum mean squared error (WMMSE) algorithm to solve the WSR maximization. Simulation results demonstrate that our approach effectively reduces jitter while meeting stringent delay-related requirements. In diverse TSN scenarios with a heavy loading ratio, our proposed co-driven scheme achieves about a 45% reduction in delay jitter compared to earliest deadline first (EDF) scheduling, while realizing user-specific delay satisfaction ratios (99.9% to 99.999%).
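The inner step of the decomposition in the summary, WSR maximization by WMMSE for a given set of user weights, can be illustrated with a minimal sketch. This is the textbook WMMSE iteration for a simplified setting and is not taken from the paper: it assumes single-antenna users, one subcarrier, one time slot, perfect channel knowledge, and a single sum-power budget, whereas the paper treats time-space-frequency precoding. The names `wmmse` and `weighted_sum_rate`, the channel matrix `H`, the weight vector `alpha` (standing in for the weights a rate scheduler would supply), and all numeric values are hypothetical.

```python
import numpy as np


def weighted_sum_rate(H, V, alpha, sigma2):
    """Weighted sum rate in bit/s/Hz. Row H[k] is h_k^H; column V[:, j] is v_j."""
    G = np.abs(H @ V) ** 2                       # G[k, j] = |h_k^H v_j|^2
    signal = np.diag(G)
    interference = G.sum(axis=1) - signal
    return float(np.sum(alpha * np.log2(1.0 + signal / (interference + sigma2))))


def wmmse(H, alpha, P, sigma2=1.0, iters=50):
    """Textbook WMMSE under a sum-power budget P (simplifying assumption)."""
    K, M = H.shape
    # Start from matched-filter precoders scaled to the full power budget.
    V = H.conj().T.copy()
    V *= np.sqrt(P) / np.linalg.norm(V)
    for _ in range(iters):
        G = H @ V
        g = np.diag(G)                           # useful gains h_k^H v_k
        total = np.sum(np.abs(G) ** 2, axis=1) + sigma2
        u = g / total                            # MMSE receive scalars
        e = 1.0 - np.abs(g) ** 2 / total         # resulting per-user MSEs
        w = 1.0 / e                              # MSE weights
        # Precoder update: V(mu) = (A + mu I)^{-1} B, with mu >= 0 chosen
        # by bisection so that the total transmit power does not exceed P.
        c = alpha * w * np.abs(u) ** 2
        A = H.conj().T @ (c[:, None] * H)
        B = H.conj().T * (alpha * w * u)[None, :]

        def precoder(mu):
            return np.linalg.solve(A + mu * np.eye(M), B)

        hi = 1.0
        while np.linalg.norm(precoder(hi)) ** 2 > P:
            hi *= 2.0                            # grow until feasible
        lo = 0.0
        for _ in range(40):
            mid = 0.5 * (lo + hi)
            if np.linalg.norm(precoder(mid)) ** 2 > P:
                lo = mid
            else:
                hi = mid
        V = precoder(hi)                         # hi is always feasible
    return V


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, M = 4, 8                                  # users, total transmit antennas
    H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)
    alpha = np.array([1.0, 2.0, 0.5, 1.0])       # hypothetical scheduler weights
    V = wmmse(H, alpha, P=10.0)
    print("power:", np.linalg.norm(V) ** 2)
    print("WSR:", weighted_sum_rate(H, V, alpha, 1.0))
```

In this sketch, raising a user's entry in `alpha` steers the precoder toward a higher instantaneous rate for that user, which is the lever a weight-allocating scheduler would act on. Per-access-point power limits, multiple subcarriers, and the CMDP-based weight selection are outside its scope.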