A High Performance, Low Complexity Algorithm for Multi-Player Bandits Without Collision Sensing Information
Motivated by applications in cognitive radio networks, we consider the decentralized multi-player multi-armed bandit problem, without collision nor sensing information. We propose Randomized Selfish KL-UCB, an algorithm with very low computational complexity, inspired by the Selfish KL-UCB algorithm...
        Saved in:
      
    
          | Main Authors | , | 
|---|---|
| Format | Journal Article | 
| Language | English | 
| Published | 
          
        19.02.2021
     | 
| Subjects | |
| Online Access | Get full text | 
| DOI | 10.48550/arxiv.2102.10200 | 
Cover
| Summary: | Motivated by applications in cognitive radio networks, we consider the
decentralized multi-player multi-armed bandit problem, without collision nor
sensing information. We propose Randomized Selfish KL-UCB, an algorithm with
very low computational complexity, inspired by the Selfish KL-UCB algorithm,
which has been abandoned as it provably performs sub-optimally in some cases.
We subject Randomized Selfish KL-UCB to extensive numerical experiments showing
that it far outperforms state-of-the-art algorithms in almost all environments,
sometimes by several orders of magnitude, and without the additional knowledge
required by state-of-the-art algorithms. We also emphasize the potential of
this algorithm for the more realistic dynamic setting, and support our claims
with further experiments. We believe that the low complexity and high
performance of Randomized Selfish KL-UCB makes it the most suitable for
implementation in practical systems amongst known algorithms. | 
|---|---|
| DOI: | 10.48550/arxiv.2102.10200 |