Machine Learning in Cyber Security Analytics using NSL-KDD Dataset

Bibliographic Details
Published in	Conference on Technologies and Applications of Artificial Intelligence (Online), pp. 260-265
Main Authors	Hong, Rui-Fong; Horng, Shih-Cheng; Lin, Shieh-Shing
Format	Conference Proceeding
Language	English
Published	IEEE 01.11.2021
Subjects	Classification; Cyber security; Decision trees; Machine learning; Machine learning algorithms; NSL-KDD; Prediction algorithms; Python; Shape; Software; Support vector machines; Training data
ISSN	2376-6824
DOI	10.1109/TAAI54685.2021.00057

Abstract	Classification is the process of recognizing, understanding, and grouping ideas and objects into given categories. Classification techniques learn patterns from training data to predict the likelihood that subsequent data will fall into one of those categories, and a variety of algorithms can be used for this purpose. Network attacks of many types continue to occur. This work pre-processes the NSL-KDD data set and analyzes it in Python with machine learning algorithms. We apply several machine learning methods, namely decision trees, random forests, Naïve Bayes, k-nearest neighbors (KNN), gradient boosted trees, and support vector machines (SVM), and for each we examine the confusion matrix and the prediction accuracy. We also plot the receiver operating characteristic (ROC) curve and compute the area under the curve (AUC). We calculate the accuracy (ACC) and make a simple assessment of the quality of the different algorithms. Test results show that, with data pre-processing, the machine learning algorithms can achieve extremely high accuracy.
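The evaluation measures named in the abstract (confusion matrix, ACC, ROC curve, AUC) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores below are assumptions made purely for illustration, and no NSL-KDD data or results from the paper are used. Labels follow the assumed convention 1 = attack, 0 = normal.

```python
from typing import List, Tuple


def confusion_matrix(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary labels (1 = attack, 0 = normal)."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """ACC = (TP + TN) / total number of samples."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tn + fp + fn + tp)


def roc_curve(y_true: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points, lowering the threshold one distinct score at a time."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample that shares this score so ties form one step.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data (hypothetical, not from NSL-KDD): true labels and classifier scores.
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("confusion (tn, fp, fn, tp):", confusion_matrix(y_true, y_pred))
print("ACC:", accuracy(y_true, y_pred))
print("AUC:", auc(roc_curve(y_true, scores)))
```

The confusion matrix and ACC depend on the chosen threshold, while the ROC curve and AUC summarize behavior across all thresholds, which is one reason for reporting both kinds of measure when comparing classifiers.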
Author	Horng, Shih-Cheng; Hong, Rui-Fong; Lin, Shieh-Shing
Author_xml	– sequence: 1 givenname: Rui-Fong surname: Hong fullname: Hong, Rui-Fong email: s10927609@cyut.edu.tw organization: Chaoyang University of Technology, Department of Computer Science & Information Engineering, Taichung, Taiwan, R.O.C
	– sequence: 2 givenname: Shih-Cheng surname: Horng fullname: Horng, Shih-Cheng email: schong@cyut.edu.tw organization: Chaoyang University of Technology, Department of Computer Science & Information Engineering, Taichung, Taiwan, R.O.C
	– sequence: 3 givenname: Shieh-Shing surname: Lin fullname: Lin, Shieh-Shing email: sslin@mail.sju.edu.tw organization: St. John's University, Department of Electrical Engineering, Taipei, Taiwan, R.O.C
CODEN	IEEPAD
ContentType	Conference Proceeding
DBID	6IE 6IL CBEJK RIE RIL
DatabaseName	IEEE Electronic Library (IEL) Conference Proceedings IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE Xplore digital library IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml	– sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher
DeliveryMethod	fulltext_linktorsrc
EISBN	1665408251 9781665408257
EISSN	2376-6824
EndPage	265
ExternalDocumentID	9778058
Genre	orig-research
GrantInformation_xml	– fundername: Ministry of Science and Technology funderid: 10.13039/501100003711
GroupedDBID	6IE 6IF 6IL 6IN AAWTH ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK OCL RIE RIL
ID	FETCH-LOGICAL-i480-95ca1969557cd4bd63cd4fab2357781bbfb5274fa27749752d355f7f289366093
IEDL.DBID	RIE
IngestDate	Wed Aug 27 02:35:29 EDT 2025
IsPeerReviewed	false
IsScholarly	false
LinkModel	DirectLink
MergedId	FETCHMERGED-LOGICAL-i480-95ca1969557cd4bd63cd4fab2357781bbfb5274fa27749752d355f7f289366093
PageCount	6
ParticipantIDs	ieee_primary_9778058
PublicationCentury	2000
PublicationDate	2021-Nov.
PublicationDateYYYYMMDD	2021-11-01
PublicationDate_xml	– month: 11 year: 2021 text: 2021-Nov.
PublicationDecade	2020
PublicationTitle	Conference on Technologies and Applications of Artificial Intelligence (Online)
PublicationTitleAbbrev	TAAI
PublicationYear	2021
Publisher	IEEE
Publisher_xml	– name: IEEE
SSID	ssj0003204181
Score	1.7775822
SourceID	ieee
SourceType	Publisher
StartPage	260
URI	https://ieeexplore.ieee.org/document/9778058
hasFullText	1
inHoldings	1
isFullTextHit
isPrint