PBDT: Python Backdoor Detection Model Based on Combined Features


Bibliographic Details
Published in Security and Communication Networks, Vol. 2021, pp. 1-13
Main Authors Fang, Yong; Xie, Mingyu; Huang, Cheng
Format Journal Article
Language English
Published London Hindawi 14.09.2021
John Wiley & Sons, Inc
ISSN 1939-0114
1939-0122
DOI 10.1155/2021/9923234

Summary: Application security is essential in today’s period of rapid development. A backdoor is a means by which attackers can invade a system to achieve illegal purposes and damage users’ rights, and it poses a serious threat to network security. It is therefore urgent to take adequate measures to defend against such attacks. Previous research focused mainly on PHP webshells, with less work on Python backdoor files, and language differences mean that existing methods are not entirely applicable. This paper proposes a Python backdoor detection model named PBDT, based on combined features. The model summarizes the functional modules and functions commonly found in backdoor files and extracts the number of calls to them in the text to form sample features. In addition, it considers the text’s statistical characteristics, such as information entropy and the longest string, to identify obfuscated Python code. The opcode sequence is also used to represent code characteristics, via a TF-IDF vector and a FastText classifier, to eliminate the influence of interference items. Finally, the Random Forest algorithm is used to build a classifier. On samples covering most types of backdoors, some of them obfuscated, the model achieves an accuracy of 97.70% and a true negative rate (TNR) of 98.66%, showing good classification performance in Python backdoor detection.
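The three feature families named in the summary (call counts, text statistics, opcode sequences) can be illustrated with a minimal sketch that uses only the Python standard library. This is not the authors' implementation. The watch list `SUSPICIOUS_NAMES`, the reading of "longest string" as the longest whitespace-delimited token, and all function names are assumptions made for illustration.

```python
import ast
import dis
import math
from collections import Counter

# Hypothetical watch list; the modules and functions actually used by PBDT
# are not listed in this record.
SUSPICIOUS_NAMES = {"socket", "subprocess", "exec", "eval", "system", "popen"}


def shannon_entropy(text):
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def longest_token_length(text):
    """Length of the longest whitespace-delimited token (assumed meaning of
    'longest string'); long tokens often indicate encoded payloads."""
    return max((len(tok) for tok in text.split()), default=0)


def call_counts(source):
    """Count calls whose function or attribute name is on the watch list."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):
                name = func.id
            else:
                name = getattr(func, "attr", None)
            if name in SUSPICIOUS_NAMES:
                counts[name] += 1
    return counts


def opcode_sequence(source):
    """Compile (without executing) the source and return its opcode names,
    including those of nested code objects such as function bodies."""
    ops = []
    stack = [compile(source, "<sample>", "exec")]
    while stack:
        code = stack.pop()
        ops.extend(ins.opname for ins in dis.get_instructions(code))
        stack.extend(c for c in code.co_consts if hasattr(c, "co_code"))
    return ops


if __name__ == "__main__":
    sample = "import os\nos.system('id')\neval('1+1')\n"
    print(call_counts(sample))
    print(round(shannon_entropy(sample), 3), longest_token_length(sample))
    print(opcode_sequence(sample)[:5])
```

In a pipeline like the one the summary describes, the call counts and statistics would presumably be concatenated with a vectorized form of the opcode sequence (for example TF-IDF) before being passed to a Random Forest classifier. Opcode names vary between Python versions, so opcode features should be extracted with a single interpreter version.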