Deep Learning-Based Malware Detection Using PE Headers

Bibliographic Details
Published in	Information and Software Technologies, Vol. 1665, pp. 3-18
Main Authors	Nakrošis, Arnas; Lagzdinytė-Budnikė, Ingrida; Paulauskaitė-Tarasevičienė, Agnė; Paulikas, Giedrius; Dapkus, Paulius
Format	Book Chapter
Language	English
Published	Switzerland: Springer International Publishing AG, 2022
Series	Communications in Computer and Information Science
Subjects	Deep learning; Machine learning; Malicious software; Malware; PE header
ISBN	9783031163012 303116301X
ISSN	1865-0929 1865-0937
DOI	10.1007/978-3-031-16302-9_1

Summary:	As technology advances, developers of intrusive software find increasingly sophisticated ways to hide malicious code in software environments, which makes viruses in infected data difficult to identify during the analysis and detection phases. For this reason, considerable attention has been devoted to researching and developing methods and techniques that can identify a wide range of malware without compromising the execution environment. To propose new methods, researchers investigate not only the structure of malware detection algorithms but also the properties that can be extracted from files. Extracted features allow malware to be detected even when virus creation tools change. The authors of this study propose a data structure of 486 attributes that describe the most important file characteristics, and use it to train neural networks to detect viruses. The data set was built from over 400,000 infected and benign files. Various machine learning algorithms based on unsupervised (k-means, self-organizing maps) and supervised (VGG-16, convolutional neural networks, ResNet) learning were tested to determine how useful they are for detecting malicious software. Based on these experiments, the authors created and propose a neural network architecture consisting of Dense and Dropout layers with L2 regularization that detects 8 types of malware with 98% accuracy. A major strength of the study is that the research was carried out on a large number of files. The proposed architecture recognizes malware with at least the same accuracy as solutions offered by other authors and can be used in practice to protect workstations against malicious files.
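
Illustrative sketch (not part of the catalog record): the summary names a network of Dense and Dropout layers with L2 regularization, fed with 486 attributes and distinguishing 8 malware types. The standard-library Python below shows how such pieces fit together in a forward pass and a regularized loss. Only the input size (486) and the class count (8) come from the summary; the hidden layer sizes, dropout rate, L2 strength, ReLU activation, softmax output, and every function name are assumptions made for illustration, and this is not the authors' published architecture or training code.

```python
import math
import random

N_FEATURES = 486   # attribute count stated in the summary
N_CLASSES = 8      # malware types stated in the summary (one output each is an assumption)
HIDDEN = (64, 32)  # hypothetical hidden layer sizes, not from the source
L2_LAMBDA = 1e-4   # hypothetical L2 regularization strength
DROP_RATE = 0.5    # hypothetical dropout rate


def init_dense(n_in, n_out, rng):
    """Return (weights, biases) for one Dense layer with small random weights."""
    scale = math.sqrt(2.0 / n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Affine transform: one output value per weight row."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def dropout(x, rate, rng, training):
    """Inverted dropout: zero units with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(x)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def softmax(x):
    m = max(x)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def build_model(rng):
    """Create Dense layers for 486 -> HIDDEN... -> 8."""
    sizes = (N_FEATURES,) + HIDDEN + (N_CLASSES,)
    return [init_dense(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(model, x, rng, training=False):
    """Dense -> ReLU -> Dropout for each hidden layer, then Dense -> softmax."""
    for layer in model[:-1]:
        x = dropout(relu(dense(x, layer)), DROP_RATE, rng, training)
    return softmax(dense(x, model[-1]))


def l2_penalty(model, lam=L2_LAMBDA):
    """L2 term added to the training loss: lam times the sum of squared weights."""
    return lam * sum(w * w for weights, _ in model for row in weights for w in row)


def loss(model, x, label, rng, training=True):
    """Cross-entropy for one sample plus the L2 penalty."""
    probs = forward(model, x, rng, training)
    return -math.log(probs[label] + 1e-12) + l2_penalty(model)
```

In this sketch dropout is active only when `training=True`, so inference is deterministic, and the L2 term penalizes large weights in every Dense layer. A real implementation would typically use a deep learning framework and add a training loop, both of which are omitted here.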