Patty: Pattern Series-Based Semantics Analysis for Agnostic Industrial Control Protocols

Published in: IEEE Transactions on Information Forensics and Security, Vol. 20, pp. 5478-5491
Main Authors: Yang, Daoqing; Yao, Yu; Shan, Yao; Yang, Licheng; Yang, Wei; Liu, Fuyi; Wu, Yunfeng
Format: Journal Article
Language: English
Published: IEEE, 2025
ISSN: 1556-6013, 1556-6021
DOI: 10.1109/TIFS.2025.3569129

Summary: Reverse engineering of agnostic industrial control protocols (ICPs) based on traffic traces is significant for the security analysis of industrial control systems. Field semantics deduction is an essential step in protocol reverse engineering, following the discovery of message fields. Most existing methods rely on knowledge-based analysis for specific fields of common protocols; they require too many assumptions and lack semantic knowledge about ICPs. In this paper, we propose a new concept, pattern series, and design the first classification framework for inferring the semantic types of unknown ICPs. Specifically, we first present the definition of pattern series and design the field pattern series generation algorithm for building training data. We then develop a field semantics classification model that learns semantic features from known protocols and applies them to predict semantic types in unknown protocols. Lastly, we implement a probability-maximizing selection algorithm to obtain optimal semantic types. We demonstrate the effectiveness of the proposed method through extensive experiments with five popular ICPs, including their mixed protocols. Evaluations show that our approach significantly outperforms baseline methods in field semantic recognition, achieving an F1-score of ≥ 90.8%.