Static Malware Analysis Using Machine Learning Algorithms on APT1 Dataset with String and PE Header Features

Bibliographic Details
Published in 2019 International Conference on Computational Science and Computational Intelligence (CSCI), pp. 90-95
Main Authors Balram, Neil; Hsieh, George; McFall, Christian
Format Conference Proceeding
Language English
Published IEEE 01.12.2019
DOI 10.1109/CSCI49370.2019.00022

Abstract Static malware analysis is used to analyze executable files without executing the code to determine whether a file is malicious or not. Data analytic and machine learning techniques have been used increasingly to help process the large number of malware files circulating in the wild and detect new attacks. In this paper, we present the design and implementation of six different machine learning classifiers, and two distinct categories of features statically extracted from the executables: strings and Portable Executable header information. A total of twelve malware detectors were implemented for each of the six classifiers to operate with each of the two feature categories separately. These classifiers and feature extraction algorithms were implemented in Python using the scikit-learn machine learning library. The performances in detection accuracy and required processing time of the twelve malware detectors were compared and analyzed.
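
The two feature categories named in the abstract (strings and Portable Executable header information) can be illustrated with a minimal sketch using only the Python standard library. This is not the authors' code: the function names, the 4-character minimum string length, the choice of COFF header fields, and the synthetic sample bytes are all assumptions made for illustration. The record does not say which header fields or which six classifiers the paper used.

```python
import re
import struct
from typing import Dict, List


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII of at least min_len bytes.

    The threshold of 4 is an assumption (it mirrors the default of the
    Unix `strings` tool), not a value taken from the paper.
    """
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [m.decode("ascii") for m in pattern.findall(data)]


def parse_pe_header(data: bytes) -> Dict[str, int]:
    """Read the COFF file header fields of a PE image without executing it."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # The 20-byte COFF file header immediately follows the signature.
    fields = struct.unpack_from("<HHIIIHH", data, e_lfanew + 4)
    names = (
        "machine",
        "number_of_sections",
        "time_date_stamp",
        "pointer_to_symbol_table",
        "number_of_symbols",
        "size_of_optional_header",
        "characteristics",
    )
    return dict(zip(names, fields))


def build_toy_pe() -> bytes:
    """Build a tiny synthetic header (hypothetical sample, not a real binary)."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature follows the DOS header
    coff = struct.pack("<HHIIIHH", 0x14C, 3, 0, 0, 0, 0xE0, 0x0102)
    tail = b"\x00\x01kernel32.dll\x00ab\x00LoadLibraryA\x00"
    return bytes(dos) + b"PE\x00\x00" + coff + tail


if __name__ == "__main__":
    sample = build_toy_pe()
    print(extract_strings(sample))
    print(parse_pe_header(sample))
```

In a pipeline like the one the abstract describes, each feature category would be turned into numeric vectors and passed to a scikit-learn estimator through its `fit` and `predict` methods, with accuracy and wall-clock time recorded per classifier and feature category. That step is omitted here because the record gives no detail about it.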
Author Balram, Neil (Norfolk State University, USA)
Hsieh, George (Norfolk State University, USA)
McFall, Christian (Norfolk State University, USA)
CODEN IEEPAD
ContentType Conference Proceeding
Discipline Forestry
EISBN 1728155843
9781728155845
EndPage 95
ExternalDocumentID 9071045
Genre orig-research
IngestDate Thu Jun 29 18:38:57 EDT 2023
IsPeerReviewed false
IsScholarly false
Language English
PageCount 6
PublicationCentury 2000
PublicationDate 2019-Dec
PublicationDateYYYYMMDD 2019-12-01
PublicationDate_xml – month: 12
  year: 2019
  text: 2019-Dec
PublicationDecade 2010
PublicationTitle 2019 International Conference on Computational Science and Computational Intelligence (CSCI)
PublicationTitleAbbrev CSCI
PublicationYear 2019
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 90
SubjectTerms APT1 dataset
Data mining
Detectors
Feature extraction
Forestry
Logistics
Machine learning
Malware
malware analysis
Title Static Malware Analysis Using Machine Learning Algorithms on APT1 Dataset with String and PE Header Features
URI https://ieeexplore.ieee.org/document/9071045