Cost-Aware Feature Selection for IoT Device Classification

Bibliographic Details
Published in IEEE Internet of Things Journal Vol. 8; no. 14; pp. 11052-11064
Main Authors Chakraborty, Biswadeep; Divakaran, Dinil Mon; Nevat, Ido; Peters, Gareth W.; Gurusamy, Mohan
Format Journal Article
Language English
Published Piscataway IEEE 15.07.2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 2327-4662
DOI 10.1109/JIOT.2021.3051480

Summary: The classification of Internet-of-Things (IoT) devices into different types is of paramount importance, from multiple perspectives, including security and privacy aspects. Recent works have explored machine learning techniques for fingerprinting (or classifying) IoT devices, with promising results. However, the existing works have assumed that the features used for building the machine learning models are readily available or can be easily extracted from the network traffic; in other words, they do not consider the costs associated with feature extraction. In this work, we take a more realistic approach, and argue that feature extraction has a cost, and the costs are different for different features. We also take a step forward from the current practice of considering the misclassification loss as a binary value, and make a case for different losses based on the misclassification performance. Thereby, and more importantly, we introduce the notion of risk for IoT device classification. We define and formulate the problem of cost-aware IoT device classification. This being a combinatorial optimization problem, we develop a novel algorithm to solve it in a fast and effective way using the cross-entropy (CE)-based stochastic optimization technique. Using traffic of real devices, we demonstrate the capability of the CE-based algorithm in selecting features with minimal risk of misclassification while keeping the cost for feature extraction within a specified limit.
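
Illustrative sketch (not taken from the article): the summary describes choosing a feature subset that minimizes misclassification risk while keeping total extraction cost within a limit, solved with cross-entropy (CE) stochastic optimization. The Python below is a generic textbook-style CE loop over binary feature masks, not the authors' algorithm. The function name `ce_feature_select`, its parameters, the rejection of over-budget samples, and the toy costs and risk function are all assumptions made for illustration; in practice the risk would be estimated by training and evaluating a classifier on the selected features.

```python
import random


def ce_feature_select(costs, risk, budget, n_samples=200, elite_frac=0.1,
                      n_iters=30, smoothing=0.7, seed=0):
    """Generic cross-entropy search over feature subsets (hypothetical helper).

    costs  : extraction cost of each feature
    risk   : callable mapping a boolean mask to a risk value (lower is better)
    budget : maximum total extraction cost allowed
    """
    rng = random.Random(seed)
    n = len(costs)
    probs = [0.5] * n  # Bernoulli inclusion probability per feature
    n_elite = max(1, int(elite_frac * n_samples))
    best_mask, best_risk = None, float("inf")

    for _ in range(n_iters):
        feasible = []
        for _ in range(n_samples):
            mask = [rng.random() < p for p in probs]
            cost = sum(c for c, m in zip(costs, mask) if m)
            # Assumption: over-budget subsets are simply rejected.
            if cost <= budget:
                feasible.append((risk(mask), mask))
        if not feasible:
            # No sample met the budget: bias sampling toward smaller subsets.
            probs = [0.5 * p for p in probs]
            continue

        feasible.sort(key=lambda pair: pair[0])
        if feasible[0][0] < best_risk:
            best_risk, best_mask = feasible[0]

        # Move each probability toward its frequency among the elite samples.
        elite = [mask for _, mask in feasible[:n_elite]]
        for j in range(n):
            freq = sum(mask[j] for mask in elite) / len(elite)
            probs[j] = smoothing * freq + (1 - smoothing) * probs[j]

    return best_mask, best_risk


# Toy stand-in for a real risk estimate: each selected feature lowers the
# risk by a fixed, made-up amount.
toy_costs = [5, 1, 2, 1, 4]
toy_gains = [0.30, 0.25, 0.20, 0.05, 0.02]


def toy_risk(mask):
    return 1.0 - sum(g for g, m in zip(toy_gains, mask) if m)


selected, value = ce_feature_select(toy_costs, toy_risk, budget=4)
print(selected, round(value, 3))
```

In this toy setup the most informative feature is also the most expensive and exceeds the budget on its own, so the search has to settle for a combination of cheaper features. That is the kind of trade-off between extraction cost and misclassification risk the summary refers to.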