Joint Deep Reinforcement Learning and Unfolding: Beam Selection and Precoding for mmWave Multiuser MIMO With Lens Arrays

Bibliographic Details
Published in	IEEE Journal on Selected Areas in Communications, Vol. 39, no. 8, pp. 2289-2304
Main Authors	Hu, Qiyu; Liu, Yanzhen; Cai, Yunlong; Yu, Guanding; Ding, Zhi
Format	Journal Article
Language	English
Published	New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.08.2021
Subjects	Algorithms; Arrays; Artificial neural networks; Beam selection; Codes; Deep learning; Deep reinforcement learning; Deep-unfolding; Discrete lens arrays; Iterative algorithms; Lenses; Machine learning; Markov processes; Mean square error methods; Millimeter waves; Neural networks; Precoding; Precoding design; Radio equipment; Reinforcement learning; Robustness; Simulation
ISSN	0733-8716 1558-0008
DOI	10.1109/JSAC.2021.3087233

Summary:	Millimeter wave (mmWave) multiuser multiple-input multiple-output (MU-MIMO) systems with discrete lens arrays (DLA) have received great attention due to their simple hardware implementation and excellent performance. In this work, we investigate the joint design of the beam selection and digital precoding matrices for mmWave MU-MIMO systems with DLA, aiming to maximize the sum-rate subject to the transmit power constraint and the structural constraints on the selection matrix. The resulting non-convex problem, with discrete variables and coupled constraints, is challenging to solve, so we propose an efficient joint neural network (NN) design framework to tackle it. Specifically, the proposed framework consists of a deep reinforcement learning (DRL)-based NN and a deep-unfolding NN, which optimize the beam selection and digital precoding matrices, respectively. For the DRL-based NN, we formulate the beam selection problem as a Markov decision process and develop a double deep Q-network algorithm to solve it. The base station is treated as the agent, and the state, action, and reward function are carefully designed. For the digital precoding matrix, we develop a deep-unfolding NN induced by the iterative weighted minimum mean-square error algorithm, which unfolds this algorithm into a layer-wise structure with introduced trainable parameters. Simulation results verify that the jointly trained NN significantly outperforms existing iterative algorithms, with reduced complexity and stronger robustness.
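The summary names a double deep Q-network algorithm for beam selection but does not spell out the update rule. As a rough illustration only, the generic double DQN target can be sketched as below: the online network chooses the next action and the target network scores it. The function name, argument names, and array shapes are hypothetical and are not taken from the paper, whose state, action, and reward design is not described in this record.

```python
import numpy as np

def double_dqn_targets(rewards, next_q_online, next_q_target, dones, gamma=0.99):
    """Generic double DQN regression targets for a batch of B transitions.

    rewards:        shape (B,)   immediate rewards
    next_q_online:  shape (B, A) online-network Q-values at the next state
    next_q_target:  shape (B, A) target-network Q-values at the next state
    dones:          shape (B,)   1.0 where the episode ended, else 0.0
    All names and shapes are illustrative assumptions.
    """
    rewards = np.asarray(rewards, dtype=float)
    next_q_online = np.asarray(next_q_online, dtype=float)
    next_q_target = np.asarray(next_q_target, dtype=float)
    dones = np.asarray(dones, dtype=float)

    # Action selection uses the online network ...
    best_actions = np.argmax(next_q_online, axis=1)
    # ... while action evaluation uses the target network.
    evaluated = next_q_target[np.arange(best_actions.shape[0]), best_actions]

    # No bootstrapping from terminal transitions.
    return rewards + gamma * (1.0 - dones) * evaluated
```

Decoupling selection from evaluation in this way is what distinguishes double DQN from plain DQN, where a single network both picks and scores the next action and tends to overestimate Q-values.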