Intrusion detection system combined enhanced random forest with SMOTE algorithm

Bibliographic Details
Published in	EURASIP Journal on Advances in Signal Processing Vol. 2022; no. 1; pp. 1-20
Main Authors	Wu, Tao; Fan, Honghui; Zhu, Hongjin; You, Congzhe; Zhou, Hongyan; Huang, Xianzhen
Format	Journal Article
Language	English
Published	Cham: Springer International Publishing, 07.05.2022; Springer; Springer Nature B.V.; SpringerOpen
Subjects	Algorithms; Classification; Cluster analysis; Clustering; Data imbalance; Datasets; Detectors; Engineering; Enhanced random forest; Intrusion detection systems; Network intrusion detection; NSL-KDD; Oversampling; Quantum Information Technology; Security; Security software; Signal, Image and Speech Processing; Similarity; SMOTE algorithm; Spintronics; Training; Vector quantization
ISSN	1687-6180, 1687-6172
DOI	10.1186/s13634-022-00871-6

Summary:	Networks face malicious attacks from many sources, and intrusion detection systems play a key role in maintaining network security. Intrusion detection models often show relatively high false detection rates because data imbalance causes a shortage of training data. To address this sample imbalance, the paper proposes a network intrusion detection algorithm based on an enhanced random forest and the synthetic minority oversampling technique (SMOTE). First, a hybrid of the K-means clustering algorithm and SMOTE sampling increases the number of minority-class samples to produce a balanced dataset, so that the features of minority-class samples can be learned more effectively. Second, the enhanced random forest produces preliminary predictions, and a similarity matrix of network attack types is then used to correct the results of the voting step. On the NSL-KDD dataset, the method reaches a classification accuracy of 99.72% on the training set and 78.47% on the test set. The authors report some improvement in classification accuracy over related work.
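
The first step in the summary, combining K-means clustering with SMOTE to enlarge the minority class, can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the details of their hybrid, so the sketch assumes one common reading, in which the minority samples are clustered first and synthetic points are then interpolated between neighbours inside the same cluster. The function names `kmeans` and `cluster_smote`, their parameters, and the toy data are all hypothetical.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_smote(minority, n_new, k=2, n_neighbors=3, seed=0):
    """Assumed hybrid: cluster the minority class, then apply SMOTE-style
    interpolation only between points of the same cluster."""
    rng = random.Random(seed)
    labels = kmeans(minority, k, seed=seed)
    clusters = [[p for p, lab in zip(minority, labels) if lab == c]
                for c in range(k)]
    # Interpolation needs at least two points in a cluster.
    clusters = [c for c in clusters if len(c) >= 2]
    if not clusters:
        raise ValueError("no cluster has two or more minority samples")
    synthetic = []
    while len(synthetic) < n_new:
        cluster = rng.choice(clusters)
        base = rng.choice(cluster)
        # Nearest neighbours of the base point within its own cluster.
        others = sorted((p for p in cluster if p is not base),
                        key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:n_neighbors])
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy minority class with two well-separated groups (made-up values).
minority = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
            [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]]
new_points = cluster_smote(minority, n_new=4)
```

Restricting interpolation to a single cluster is meant to keep synthetic points from landing in the empty space between distant groups of minority samples, which plain SMOTE can do when the minority class is multi-modal. In practice a library such as imbalanced-learn would normally be used instead of hand-written code.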