Double Deep Q-Network for Power Allocation in Cloud Radio Access Network
| Published in | 2020 IEEE 3rd International Conference on Computer and Communication Engineering Technology (CCET), pp. 272-277 |
|---|---|
| Main Authors | |
| Format | Conference Proceeding | 
| Language | English | 
| Published | IEEE, 01.08.2020 |
| Subjects | |
| DOI | 10.1109/CCET50901.2020.9213138 | 
| Summary: | Cloud radio access network (CRAN) facilitates resource allocation (RA) by isolating remote radio heads (RRHs) from baseband units (BBUs). Traditional RA algorithms save energy by dynamically turning RRHs on/off and allocating power in each time slot. However, when the energy switching cost is considered, the on/off decisions in adjacent time slots become correlated and cannot be solved directly. Fortunately, deep reinforcement learning (DRL) can effectively model such a problem, which motivates us to minimize the total power consumption subject to constraints on per-RRH transmit power and user rates. Our starting point is the deep Q-network (DQN), a combination of a neural network and Q-learning. In each time slot, DQN turns on/off the RRH yielding the largest Q-value (known as the action value) before solving a power minimization problem for the active RRHs. However, DQN suffers from a Q-value overestimation issue, which stems from using the same network to choose the best action and to compute the target Q-value of taking that action at the next state. To further increase the CRAN power savings, we propose a Double DQN-based framework that decouples action selection from target Q-value generation. Simulation results indicate that the Double DQN-based RA method outperforms the DQN-based RA algorithm in terms of total power consumption. |
|---|---|
| DOI: | 10.1109/CCET50901.2020.9213138 |
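
The decoupling described in the summary above can be made concrete. Below is a minimal NumPy sketch of the two target computations, assuming hypothetical linear Q-functions over a small discrete action set (e.g., which RRH to switch on/off); the per-RRH power-minimization subproblem and the CRAN state design from the paper are not modeled here.

```python
import numpy as np

def dqn_target(q_target, next_state, reward, gamma):
    # Standard DQN target: the same (target) network both selects
    # and evaluates the best next action, which biases the estimate upward.
    return reward + gamma * np.max(q_target(next_state))

def double_dqn_target(q_online, q_target, next_state, reward, gamma):
    # Double DQN target: the online network selects the action,
    # the target network evaluates it (selection and evaluation decoupled).
    best_action = int(np.argmax(q_online(next_state)))
    return reward + gamma * q_target(next_state)[best_action]

# Toy setup (assumed, not from the paper): 4 actions standing in for
# hypothetical RRH on/off choices, 8 state features.
rng = np.random.default_rng(0)
W_online = rng.normal(size=(4, 8))
W_target = rng.normal(size=(4, 8))
q_online = lambda s: W_online @ s
q_target = lambda s: W_target @ s

s_next = rng.normal(size=8)
reward, gamma = -1.0, 0.99  # e.g., negated power consumption as reward (assumption)
print("DQN target:       ", dqn_target(q_target, s_next, reward, gamma))
print("Double DQN target:", double_dqn_target(q_online, q_target, s_next, reward, gamma))
```

Because `np.max(q_target(s'))` evaluates the same network's own greedy choice, noise in the target network inflates the DQN estimate; the Double DQN form only inflates the target when both networks happen to err on the same action, which is the mechanism behind the power savings reported in the abstract.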