Joint Power Allocation and User Fairness Optimization for Reinforcement Learning Over mmWave-NOMA Heterogeneous Networks
| Published in | IEEE Transactions on Vehicular Technology, Vol. 73, No. 9, pp. 12962-12977 |
|---|---|
| Main Authors | , , , , |
| Format | Journal Article |
| Language | English |
| Published | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.09.2024 |
| ISSN | 0018-9545, 1939-9359 |
| DOI | 10.1109/TVT.2024.3386587 |
| Summary: | In this paper, the problem of joint power allocation and user fairness is investigated for an mmWave heterogeneous network (HetNet) with hybrid non-orthogonal multiple access (NOMA) and orthogonal multiple access (OMA) transmissions. In particular, in the assumed realistic model, the uplink (UL) of macrocell users (MCUs) and the downlink (DL) of small cell users (SCUs) share the same resource block (RB) to increase the network capacity. Furthermore, imperfect successive interference cancellation (SIC) decoding is considered due to the hardware impairments of real-world NOMA systems. Based on the number of RBs and clusters, two cases are considered: full resource allocation and partial resource allocation. The multi-objective optimization problem (MOOP), i.e., user fairness maximization and transmission power minimization, is transformed into a single-objective optimization problem (SOOP) by the weighted-sum (WS) and ε-constraint (EC) methods. Several types of reinforcement learning (RL), namely Q-learning (QL), deep Q-learning network (DQN), and deep deterministic policy gradient (DDPG), are employed to solve the optimization problems subject to minimum quality of service (QoS), minimum effect on the OMA users, imperfect SIC, and RB allocation constraints. The results demonstrate the efficiency of the proposed methods. |
|---|---|
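The summary describes turning the two-objective problem (fairness maximization, power minimization) into a SOOP via weighted-sum and ε-constraint scalarization. A minimal, hypothetical sketch of both scalarizations on a toy two-user model (the power levels, channel gains, rate model, and weights below are illustrative assumptions, not the paper's system model):

```python
import itertools
import math

# Toy illustration (not the paper's model): scalarizing a two-objective
# power-allocation problem -- maximize user fairness, minimize total
# transmit power -- via weighted-sum (WS) and epsilon-constraint (EC).

def rates(p, gains, noise=1.0):
    """Achievable rates for a simplified interference-free link model."""
    return [math.log2(1 + pi * gi / noise) for pi, gi in zip(p, gains)]

def jain_fairness(r):
    """Jain's fairness index in [1/n, 1]; 1 means perfectly equal rates."""
    s, sq = sum(r), sum(x * x for x in r)
    return s * s / (len(r) * sq) if sq else 0.0

def weighted_sum(p, gains, w, p_max):
    """WS scalarization: trade fairness against normalized total power."""
    return w * jain_fairness(rates(p, gains)) - (1 - w) * sum(p) / p_max

def eps_constraint(p, gains, eps):
    """EC scalarization: maximize fairness subject to total power <= eps."""
    return jain_fairness(rates(p, gains)) if sum(p) <= eps else -math.inf

# Exhaustive search over discrete power levels for two users
# (the paper uses RL instead; this just shows the two objectives).
levels = [0.2, 0.5, 1.0]   # hypothetical transmit power levels
gains = [1.0, 0.3]         # hypothetical channel gains
grid = list(itertools.product(levels, repeat=2))
best_ws = max(grid, key=lambda p: weighted_sum(p, gains, w=0.7, p_max=2.0))
best_ec = max(grid, key=lambda p: eps_constraint(p, gains, eps=1.0))
```

With WS, the weight `w` steers the fairness/power trade-off directly; with EC, the power budget `eps` is enforced as a hard constraint while fairness alone is maximized.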
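The summary lists Q-learning among the RL solvers applied to the scalarized SOOP. As a rough, hypothetical sketch only (a single-state, discrete-action simplification of tabular Q-learning, not the paper's full MDP, DQN, or DDPG formulation), an agent can learn which transmit power level maximizes a WS-style reward:

```python
import math
import random

random.seed(0)

def reward(level, gain=1.0, w=0.7, p_max=1.0):
    # Hypothetical WS-style reward: rate benefit minus scaled power cost.
    return w * math.log2(1 + level * gain) - (1 - w) * level / p_max

levels = [0.2, 0.5, 1.0]   # toy discrete action space (power levels)
q = [0.0] * len(levels)    # single-state Q-table (bandit-style simplification)
alpha, eps_greedy = 0.1, 0.1

for step in range(5000):
    # Epsilon-greedy action selection, then incremental Q-update.
    if random.random() < eps_greedy:
        a = random.randrange(len(levels))
    else:
        a = max(range(len(levels)), key=q.__getitem__)
    q[a] += alpha * (reward(levels[a]) - q[a])

best = levels[max(range(len(levels)), key=q.__getitem__)]
```

Because the toy reward is deterministic, each Q-value converges geometrically to its true reward, and the greedy policy settles on the level with the best rate/power trade-off; the paper's DQN and DDPG variants replace this table with neural function approximators over continuous state and action spaces.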