Sequence-Aware Vision Transformer with Feature Fusion for Fault Diagnosis in Complex Industrial Processes


Bibliographic Details
Published in Entropy (Basel, Switzerland) Vol. 27; no. 2; p. 181
Main Authors Zhang, Zhong; Xu, Ming; Wang, Song; Guo, Xin; Gao, Jinfeng; Hu, Aiguo Patrick
Format Journal Article
Language English
Published Switzerland MDPI AG 08.02.2025
MDPI
ISSN 1099-4300
DOI 10.3390/e27020181

Summary: Industrial fault diagnosis faces unique challenges: high-dimensional data, long time series, and complex couplings, all characterized by significant information entropy and intricate information dependencies inherent in the datasets. Traditional image processing methods are effective for local feature extraction but often miss the global temporal patterns that are crucial for accurate diagnosis. Deep learning models such as the Vision Transformer (ViT) capture broader temporal features, but they struggle with the varying fault causes and time dependencies inherent in industrial data, where adding encoder layers may even hinder performance. This paper proposes a novel global and local feature fusion sequence-aware ViT (GLF-ViT), which modifies the feature embedding to retain correlations between sampling points and preserve more local information. By fusing global features from the classification token with local features from the encoder, the algorithm significantly enhances complex fault diagnosis. Experimental analyses of data segment length, network depth, feature fusion, and attention head receptive field validate the approach and demonstrate that a shallower encoder network is better suited than deeper networks to high-dimensional time-series fault diagnosis in complex industrial processes. The proposed method outperforms state-of-the-art algorithms on the Tennessee Eastman (TE) dataset and demonstrates excellent performance when further validated on a power transmission fault dataset.
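The fusion idea in the summary can be illustrated with a minimal NumPy sketch. It is not the authors' GLF-ViT implementation. It rests on several assumptions that the summary does not confirm: each sampling point becomes one token, the encoder is a single self-attention layer with random, untrained weights, local features are the mean of the encoded sampling-point tokens, and fusion is plain concatenation. The function name `fused_features` and all parameter values are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def fused_features(segment, d_model=16, seed=0):
    """Return a fused global + local feature vector for one data segment.

    segment: array of shape (T, V), T sampling points by V process variables.
    All weights are random and untrained; this only shows the data flow.
    """
    rng = np.random.default_rng(seed)
    T, V = segment.shape

    # Assumption: one token per sampling point, so temporal order is kept.
    W_embed = rng.normal(scale=V ** -0.5, size=(V, d_model))
    cls_token = rng.normal(size=(1, d_model))
    tokens = np.vstack([cls_token, segment @ W_embed])  # (T + 1, d_model)

    # Assumption: one shallow single-head self-attention layer
    # (no MLP block, no layer normalization) with a residual connection.
    Wq, Wk, Wv = (
        rng.normal(scale=d_model ** -0.5, size=(d_model, d_model))
        for _ in range(3)
    )
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    attn = softmax(q @ k.T / np.sqrt(d_model))  # (T + 1, T + 1)
    encoded = tokens + attn @ v

    global_feat = encoded[0]               # classification-token output
    local_feat = encoded[1:].mean(axis=0)  # assumed pooling of local tokens
    return np.concatenate([global_feat, local_feat])  # (2 * d_model,)


# Illustrative shapes only: 32 sampling points, 52 variables.
segment = np.random.default_rng(1).normal(size=(32, 52))
features = fused_features(segment)
```

In a trained model the fused vector would feed a classifier head over the fault classes; here it only shows how a global token and pooled local tokens can be combined after a shallow encoder.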