Unsupervised Clustering for a Comparative Methodology of Machine Learning Models to Detect Domain-Generated Algorithms Based on an Alphanumeric Features Analysis

Domain Generation Algorithms (DGAs) are often used for generating huge amounts of domain names to maintain command and control between the infected computer and the bot master. By establishing as needed a great number of domain names, attackers may mask their C2 servers and escape detection. Many ma...

Full description

Saved in:
Bibliographic Details
Published inJournal of network and systems management Vol. 32; no. 1; p. 18
Main Authors Hassaoui, Mohamed, Hanini, Mohamed, El Kafhali, Said
Format Journal Article
LanguageEnglish
Published New York Springer US 01.03.2024
Springer Nature B.V
Subjects
Online AccessGet full text
ISSN1064-7570
1573-7705
DOI10.1007/s10922-023-09793-6

Cover

More Information
Summary:Domain Generation Algorithms (DGAs) are often used for generating huge amounts of domain names to maintain command and control between the infected computer and the bot master. By establishing as needed a great number of domain names, attackers may mask their C2 servers and escape detection. Many malware families have switched to a stealthier contact approach. Therefore, the traditional methods become ineffective. Over the past decades, many researches have started to use artificial intelligence to create systems able to detect DGA in traffic, but these works do not use the same data to evaluate their models. This article proposes a comparative methodology to compare machine learning models based on unsupervised clustering and then applied this methodology to study the best models belonging to neural network methods and traditional machine learning methods to detect DGAs. We extracted 21 linguistic features based on the analysis of alphanumeric and n-gram, we studied the correlation between these features in order to reduce their number. We examine in detail those Machine learning algorithms and we discuss the drawbacks and strengths of each method with specific classes of DGA to propose a new switch case model that could be always reliable to detect DGAs.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1064-7570
1573-7705
DOI:10.1007/s10922-023-09793-6