An Operating System Identification Method Based on Active Learning

Bibliographic Details
Published in 2022 International Conference on Electrical, Computer and Energy Technologies (ICECET), pp. 1-6
Main Authors Zhang, Daowei; Wang, Qiujie; Wei, Ziling; Chen, Shuhui
Format Conference Proceeding
Language English
Published IEEE 20.07.2022
DOI 10.1109/ICECET55527.2022.9873443

Summary: Machine learning algorithms are widely adopted for operating system (OS) identification and can achieve reasonable accuracy even on encrypted traffic. However, they require large amounts of labeled data for training. They also have difficulty handling imbalanced data and predicting OS types that account for only a small percentage of traffic. To address these challenges, this paper proposes, for the first time, an OS identification algorithm based on active learning (AL), for which a query strategy is designed. The proposed algorithm is tested on an unbalanced dataset. The results show that it achieves performance similar to that of existing machine learning algorithms while using only 0.4% of the labeled data for training. Compared with existing algorithms, the AL-based algorithm needs only 32% of the training time and 3.2% of the training samples to reach the same accuracy obtained with the full dataset. It also performs better than existing algorithms on multi-class classification problems.
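
The summary does not describe the query strategy itself. As a rough illustration of pool-based active learning in general, the Python sketch below repeatedly asks for the label of the sample the current model is least sure about. It is not the authors' algorithm: the nearest-centroid classifier, the margin-based query rule, the synthetic one-feature imbalanced data, and every name in the code are assumptions chosen for illustration.

```python
import random

def centroids(labeled):
    """Mean feature vector per class, from (features, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def predict(model, x):
    """Assign x to the class with the nearest centroid."""
    return min(model, key=lambda y: sq_dist(model[y], x))

def margin(model, x):
    """Gap between the two nearest centroids; a small gap means uncertain."""
    d = sorted(sq_dist(c, x) for c in model.values())
    return d[1] - d[0] if len(d) > 1 else 0.0

def active_learn(pool, oracle, seed_idx, budget):
    """Pool-based loop: each round, label the most uncertain sample."""
    labeled = {i: oracle(i) for i in seed_idx}
    for _ in range(budget):
        model = centroids([(pool[i], y) for i, y in labeled.items()])
        candidates = [i for i in range(len(pool)) if i not in labeled]
        if not candidates:
            break
        query = min(candidates, key=lambda i: margin(model, pool[i]))
        labeled[query] = oracle(query)  # the only point where a label is paid for
    return centroids([(pool[i], y) for i, y in labeled.items()]), labeled

# Hypothetical imbalanced data: 950 "common" and 50 "rare" samples, one feature.
rng = random.Random(0)
samples = [([rng.gauss(0.0, 1.0)], "common") for _ in range(950)]
samples += [([rng.gauss(6.0, 1.0)], "rare") for _ in range(50)]
rng.shuffle(samples)
pool = [x for x, _ in samples]
labels = [y for _, y in samples]

# Start from one labeled sample per class, then query ten more labels.
seeds = [labels.index("common"), labels.index("rare")]
model, labeled = active_learn(pool, lambda i: labels[i], seeds, budget=10)
accuracy = sum(predict(model, x) == y for x, y in samples) / len(samples)
print(f"labeled {len(labeled)} of {len(pool)} samples, accuracy {accuracy:.3f}")
```

Only the queried samples are ever labeled, which is the mechanism by which active learning reduces labeling cost. The figures reported in the summary come from the paper's own data and method, not from this toy setup.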