Weight Adjustment Scheme Based on Hop Count in Q-routing for Software Defined Networks-enabled Wireless Sensor Networks

Bibliographic Details
Published in: Journal of Information and Communication Convergence Engineering, Vol. 20, No. 1, pp. 22-30
Main Authors: Daniel Godfrey, Jinsoo Jang, Ki-Il Kim
Format: Journal Article
Language: English
Published: 한국정보통신학회 (Korea Institute of Information and Communication Engineering), JICCE, 01.03.2022
ISSN: 2234-8255, 2234-8883
DOI: 10.6109/jicce.2022.20.1.22


More Information
Summary: The reinforcement learning algorithm has proven its potential in solving sequential decision-making problems under uncertainty, such as finding paths to route data packets in wireless sensor networks. With reinforcement learning, the computation of the optimum path requires careful definition of the so-called reward function, which is defined as a linear function that aggregates multiple objective functions into a single objective to compute a numerical value (reward) to be maximized. In a typically defined linear reward function, the multiple objectives to be optimized are integrated as a weighted sum with fixed weighting factors for all learning agents. This study proposes a reinforcement learning-based routing protocol for wireless sensor networks in which different learning agents prioritize different objectives by assigning weighting factors to the aggregated objectives of the reward function. We assign appropriate weighting factors to the objectives in the reward function of a sensor node according to its hop-count distance to the sink node. We expect this approach to enhance the effectiveness of multi-objective reinforcement learning for wireless sensor networks with a balanced trade-off among competing parameters. Furthermore, we propose a software-defined networking (SDN) architecture with multiple controllers for constant network monitoring, allowing learning agents to adapt to the dynamics of the network conditions. Simulation results show that the proposed scheme enhances the performance of the wireless sensor network under varied conditions, such as node density and traffic intensity, with a good trade-off among competing performance metrics.
KCI Citation Count: 0
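
To make the weighted-sum idea concrete, the sketch below shows one way a hop-count-dependent reward could feed a Q-routing update. It is a minimal illustration, not the paper's implementation: the objective names (residual energy, queuing delay, link quality), the weighting rule, and all parameter values are assumptions chosen for demonstration.

# Minimal sketch of a hop-count-weighted multi-objective reward for Q-routing.
# Objective names, the weighting rule, and all parameters below are illustrative
# assumptions, not the exact definitions used in the paper.

def hop_count_weights(hop_count: int, max_hops: int) -> dict:
    """Shift priority between objectives based on distance to the sink.
    Assumed rule: nodes far from the sink emphasize residual energy,
    nodes near the sink emphasize queuing delay."""
    ratio = min(hop_count / max_hops, 1.0)
    w_energy = 0.2 + 0.5 * ratio   # more weight on energy when far from the sink
    w_delay = 0.6 - 0.5 * ratio    # more weight on delay when close to the sink
    w_link = 0.2                   # constant share for link quality (assumed)
    return {"energy": w_energy, "delay": w_delay, "link": w_link}

def reward(objectives: dict, weights: dict) -> float:
    """Linear (weighted-sum) aggregation of normalized objective scores."""
    return sum(weights[k] * objectives[k] for k in objectives)

def q_update(q_table, node, neighbor, r, best_q_next, alpha=0.5, gamma=0.9):
    """Standard Q-learning style update of the estimate for routing via `neighbor`."""
    old_q = q_table.get((node, neighbor), 0.0)
    q_table[(node, neighbor)] = old_q + alpha * (r + gamma * best_q_next - old_q)

# Example: a node 6 hops from the sink evaluates a candidate next hop.
q_table = {}
w = hop_count_weights(hop_count=6, max_hops=10)
obs = {"energy": 0.7, "delay": 0.4, "link": 0.9}  # normalized scores in [0, 1]
q_update(q_table, node="n12", neighbor="n7", r=reward(obs, w), best_q_next=0.5)
print(w, q_table)

In this sketch, the weights always sum to one, so the reward stays on the same scale for every node while the relative priority of energy and delay shifts with hop count.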
Bibliography: http://www.jicce.org/