Delay Deterministic Cell-Free MIMO Transmission via Safety Reinforcement Learning

Bibliographic Details
Published in: IEEE Transactions on Wireless Communications, p. 1
Main Authors: Meng, Fan; Zhang, Cheng; Huang, Yongming; You, Xiaohu
Format: Journal Article
Language: English
Published IEEE 2025
ISSN: 1536-1276, 1558-2248
DOI: 10.1109/TWC.2025.3601221

Summary: Deterministic communication in the wireless domain is essential for industrial applications. However, the stochastic nature of wireless communication poses substantial challenges for time-sensitive networking (TSN) services, which require strict end-to-end latency bounds. This paper addresses the problem of minimizing long-term delay jitter in downlink cell-free multi-user multiple-input multiple-output orthogonal frequency division multiple access (MU-MIMO OFDMA) systems, subject to heterogeneous delay upper bounds and satisfaction rates. The problem involves time-space-frequency precoding constrained by user-specific delay violation probabilities and transmit power limits. To overcome the limitations of model-driven methods in handling implicit system models and the inefficiency of data-driven approaches in large action spaces, we propose a hybrid solution. Specifically, we decompose the problem into two sub-problems: rate scheduling, formulated as a constrained Markov decision process (CMDP), and instantaneous precoding, formulated as weighted sum-rate (WSR) maximization. We develop a safety reinforcement learning-based algorithm that optimizes rate scheduling by allocating user weights, and use a weighted minimum mean squared error (WMMSE) algorithm to solve the WSR maximization. Simulation results demonstrate that our approach effectively reduces jitter while meeting stringent delay-related requirements. In diverse TSN scenarios with heavy load ratios, the proposed co-driven scheme achieves about a 45% reduction in delay jitter compared to earliest deadline first (EDF) scheduling, while achieving user-specific delay satisfaction ratios (99.9% to 99.999%).
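
The quantities named in the summary can be illustrated with a minimal sketch. It is not taken from the paper: the function names are hypothetical, jitter is assumed here to be the population standard deviation of packet delays (the paper may define it differently), and the EDF rule shown is only the textbook "earliest deadline goes first" selection used as the comparison baseline.

```python
from statistics import pstdev


def delay_jitter(delays_ms):
    """Jitter, ASSUMED here to be the population standard deviation of delays."""
    return pstdev(delays_ms)


def satisfaction_ratio(delays_ms, bound_ms):
    """Fraction of packets delivered within the user's delay upper bound."""
    return sum(d <= bound_ms for d in delays_ms) / len(delays_ms)


def meets_requirement(delays_ms, bound_ms, target_ratio):
    """True if the observed satisfaction ratio reaches the user's target."""
    return satisfaction_ratio(delays_ms, bound_ms) >= target_ratio


def edf_pick(queue):
    """Textbook EDF: from (user_id, deadline_ms) pairs, pick the earliest deadline."""
    return min(queue, key=lambda item: item[1])


# Hypothetical per-packet delays for one user, in milliseconds.
delays = [4.0, 5.0, 6.0, 5.0]
jitter = delay_jitter(delays)
ratio = satisfaction_ratio(delays, bound_ms=5.0)
ok = meets_requirement(delays, bound_ms=6.0, target_ratio=0.999)
first = edf_pick([("a", 12.0), ("b", 7.5), ("c", 9.0)])
```

With heterogeneous users, each user would carry its own `bound_ms` and `target_ratio`, for example a target between 0.999 and 0.99999 as in the range quoted in the summary.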