Machine Learning in Cyber Security Analytics using NSL-KDD Dataset

Bibliographic Details
Published in Conference on Technologies and Applications of Artificial Intelligence (Online), pp. 260-265
Main Authors Hong, Rui-Fong; Horng, Shih-Cheng; Lin, Shieh-Shing
Format Conference Proceeding
Language English
Published IEEE 01.11.2021
ISSN 2376-6824
DOI 10.1109/TAAI54685.2021.00057

Abstract Classification is the process of recognizing, understanding, and grouping ideas and objects into given categories. Classification techniques use a variety of algorithms that learn patterns from training data and then predict the likelihood that subsequent data will fall into one of those categories. Network attacks of many types continue to be carried out today. This work pre-processes the NSL-KDD data set and analyzes it in Python with machine learning algorithms. We apply several machine learning methods (decision trees, random forests, Naïve Bayes, KNN, Gradient Boosted Trees, and SVM) and evaluate each one using its confusion matrix and prediction accuracy (ACC). We also plot the ROC curve, compute the area under it (AUC), and make a simple assessment of the quality of the different algorithms. Test results show that, with data pre-processing, the machine learning algorithms achieve extremely high accuracy.
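The evaluation measures named in the abstract (confusion matrix, ACC, ROC curve, AUC) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' code: the binary labelling (1 = attack, 0 = normal), the function names, and the sample labels and scores are all assumptions made for illustration, and the record does not say which libraries or thresholds the paper used.

```python
"""Sketch of the evaluation metrics named in the abstract.

Assumptions (not from the paper): binary labels (1 = attack, 0 = normal),
made-up classifier scores, and a 0.5 decision threshold.
"""


def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """ACC = (TP + TN) / total number of samples."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by lowering the threshold step by step.

    Requires at least one positive and one negative label.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples sharing the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Hypothetical labels and scores, for illustration only.
    y_true = [1, 1, 0, 1, 0, 0, 1, 0]
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]

    print("confusion matrix (tp, fp, fn, tn):", confusion_matrix(y_true, y_pred))
    print("ACC:", accuracy(y_true, y_pred))
    print("AUC:", auc(roc_points(y_true, scores)))
```

In practice the same quantities would usually be computed with a library such as scikit-learn for each of the six classifiers; the plain-Python version is shown only to make the definitions explicit.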
Author_xml – sequence: 1
  givenname: Rui-Fong
  surname: Hong
  fullname: Hong, Rui-Fong
  email: s10927609@cyut.edu.tw
  organization: Chaoyang University of Technology,Department of Computer Science & Information Engineering,Taichung,Taiwan, R.O.C
– sequence: 2
  givenname: Shih-Cheng
  surname: Horng
  fullname: Horng, Shih-Cheng
  email: schong@cyut.edu.tw
  organization: Chaoyang University of Technology,Department of Computer Science & Information Engineering,Taichung,Taiwan, R.O.C
– sequence: 3
  givenname: Shieh-Shing
  surname: Lin
  fullname: Lin, Shieh-Shing
  email: sslin@mail.sju.edu.tw
  organization: St. John's University,Department of Electrical Engineering,Taipei,Taiwan, R.O.C
CODEN IEEPAD
EISBN 1665408251
9781665408257
EISSN 2376-6824
EndPage 265
ExternalDocumentID 9778058
Genre orig-research
GrantInformation_xml – fundername: Ministry of Science and Technology
  funderid: 10.13039/501100003711
IsPeerReviewed false
IsScholarly false
PageCount 6
PublicationTitle Conference on Technologies and Applications of Artificial Intelligence (Online)
PublicationTitleAbbrev TAAI
SubjectTerms Classification
Cyber security
Decision trees
Machine learning
Machine learning algorithms
NSL-KDD
Prediction algorithms
Python
Shape
Software
Support vector machines
Training data
Title Machine Learning in Cyber Security Analytics using NSL-KDD Dataset
URI https://ieeexplore.ieee.org/document/9778058