Network defense decision-making based on a stochastic game system and a deep recurrent Q-network

Bibliographic Details
Published in: Computers & Security, Vol. 111, p. 102480
Main Authors: Liu, Xiaohu; Zhang, Hengwei; Dong, Shuqin; Zhang, Yuchen
Format: Journal Article
Language: English
Published: Amsterdam: Elsevier Ltd; Elsevier Sequoia S.A., 01.12.2021
ISSN: 0167-4048, 1872-6208
DOI: 10.1016/j.cose.2021.102480

Summary: Defense decision-making in cybersecurity has increasingly relied upon stochastic games that combine game theory with a Markov decision process (MDP). However, the MDP presumes that both attackers and defenders are perfectly rational and have complete information, which greatly limits the applicability and guidance value of the MDP for defense decision-making. The present study addresses this issue by applying a partially observable MDP (POMDP) to model attack-defense behaviors, together with a deep Q-network (DQN) built on a recurrent neural network to solve for game equilibria dynamically and intelligently under conditions of bounded rationality and incomplete information. The proposed method enables network defense strategies to leverage online learning and gradually approach the optimal defense strategy. The rationality and convergence of the approach are demonstrated through simulations and comparative analyses of attackers and defenders engaged in distributed reflection denial-of-service attacks.
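The summary describes a recurrent Q-network acting under partial observability; the sketch below illustrates that general idea in PyTorch, not the authors' implementation. The class name DRQN and the parameters obs_dim, n_actions, and hidden are illustrative assumptions, and the sketch omits the replay buffer, target network, and training loop that a full DQN method would need.

```python
import torch
import torch.nn as nn

class DRQN(nn.Module):
    """Recurrent Q-network sketch: an LSTM over a sequence of partial
    observations, followed by a linear head giving one Q-value per action."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq, hidden_state=None):
        # obs_seq: (batch, seq_len, obs_dim) -- a window of partial observations
        out, hidden_state = self.lstm(obs_seq, hidden_state)
        q_values = self.head(out)  # (batch, seq_len, n_actions)
        return q_values, hidden_state

# Toy usage: pick the greedy defense action from the last time step's Q-values.
net = DRQN(obs_dim=8, n_actions=4)          # dimensions are illustrative only
obs_seq = torch.randn(1, 10, 8)             # one episode of 10 partial observations
q, _ = net(obs_seq)
action = q[:, -1].argmax(dim=-1)            # highest-value defense action
```

The recurrence is what makes this fit the POMDP setting: the LSTM's hidden state summarizes the observation history, standing in for the full game state that the defender cannot observe directly.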