Network defense decision-making based on a stochastic game system and a deep recurrent Q-network

Bibliographic Details
Published in: Computers & Security, Vol. 111, p. 102480
Main Authors: Liu, Xiaohu; Zhang, Hengwei; Dong, Shuqin; Zhang, Yuchen
Format: Journal Article
Language: English
Published: Amsterdam: Elsevier Ltd; Elsevier Sequoia S.A., 01.12.2021
ISSN: 0167-4048, 1872-6208
DOI: 10.1016/j.cose.2021.102480

Summary: Defense decision-making in cybersecurity has increasingly relied upon stochastic games that combine game theory with a Markov decision process (MDP). However, the MDP presumes that both attackers and defenders are perfectly rational and have complete information, which greatly limits the applicability and guidance value of the MDP for defense decision-making. The present study addresses this issue by applying a partially observable MDP (POMDP) to model attack-defense behaviors, together with a deep Q-network (DQN) built on a recurrent neural network to solve for game equilibria dynamically and intelligently under conditions of bounded rationality and incomplete information. The proposed method enables network defense strategies to leverage online learning and gradually approach the optimal defense strategy. The rationality and convergence of the approach are demonstrated through simulations and comparative analyses of attackers and defenders engaged in distributed reflection denial-of-service attacks.
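The summary describes a recurrent Q-network acting under partial observability; the sketch below illustrates that general idea in PyTorch, not the authors' implementation. The class name DRQN and the parameters obs_dim, n_actions, and hidden are illustrative assumptions, and the sketch omits the replay buffer, target network, and training loop that a full DQN method would need.

```python
import torch
import torch.nn as nn

class DRQN(nn.Module):
    """Recurrent Q-network sketch: an LSTM over a sequence of partial
    observations, followed by a linear head giving one Q-value per action."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq, hidden_state=None):
        # obs_seq: (batch, seq_len, obs_dim) -- a window of partial observations
        out, hidden_state = self.lstm(obs_seq, hidden_state)
        q_values = self.head(out)  # (batch, seq_len, n_actions)
        return q_values, hidden_state

# Toy usage: pick the greedy defense action from the last time step's Q-values.
net = DRQN(obs_dim=8, n_actions=4)          # dimensions are illustrative only
obs_seq = torch.randn(1, 10, 8)             # one episode of 10 partial observations
q, _ = net(obs_seq)
action = q[:, -1].argmax(dim=-1)            # highest-value defense action
```

The recurrence is what makes this fit the POMDP setting: the LSTM's hidden state summarizes the observation history, standing in for the full game state that the defender cannot observe directly.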