TMU-Net: A Transformer-Based Multimodal Framework with Uncertainty Quantification for Driver Fatigue Detection

Bibliographic Details
Published in Sensors (Basel, Switzerland) Vol. 25; no. 17; p. 5364
Main Authors Zhang, Yaxin; Xu, Xuegang; Du, Yuetao; Zhang, Ningchao
Format Journal Article
Language English
Published Switzerland MDPI AG 01.09.2025
ISSN 1424-8220
DOI 10.3390/s25175364

Summary: Fatigued driving is a prevalent issue that frequently contributes to traffic accidents, which has prompted the development of automated fatigue detection methods based on various data sources, particularly reliable physiological signals. However, challenges in accuracy, robustness, and practicality persist, especially for cross-subject detection. Multimodal data fusion can improve the estimation of driver fatigue. In this work, we leverage the advantages of multimodal signals to propose a novel Multimodal Attention Network (TMU-Net) for driver fatigue detection, which achieves precise fatigue assessment by integrating electroencephalogram (EEG) and electrooculogram (EOG) signals. The core innovations of TMU-Net are a unimodal feature extraction module, which combines causal convolution, ConvSparseAttention, and Transformer encoders to capture spatiotemporal features, and a multimodal fusion module, which employs cross-modal attention and uncertainty-weighted gating to dynamically integrate complementary information. By incorporating uncertainty quantification, TMU-Net significantly enhances robustness to noise and individual variability. Experimental validation on the SEED-VIG dataset demonstrates TMU-Net’s superior performance stability across 23 subjects in cross-subject testing, leveraging the complementary strengths of EEG (2 Hz full-band and five-band features) and EOG signals for high-precision fatigue detection. Furthermore, attention heatmap visualization reveals the dynamic interaction mechanisms between EEG and EOG signals, supporting the physiological plausibility of TMU-Net’s feature fusion strategy. Practical challenges and future research directions for fatigue detection methods are also discussed.
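
The summary names uncertainty-weighted gating as the fusion step but does not give its formulation. As a rough illustration of the general idea only, the sketch below fuses two feature vectors with gates proportional to inverse variance, so the modality with the lower predicted uncertainty contributes more. The function name, the log-variance inputs, and the feature values are all hypothetical and are not taken from the paper; TMU-Net's actual gating may differ.

```python
import math


def uncertainty_weighted_fusion(features, log_variances):
    """Fuse per-modality feature vectors with inverse-variance gates.

    features: list of equal-length feature vectors, one per modality.
    log_variances: one log-variance per modality (assumed here to come
        from some uncertainty head; higher means less reliable).
    Returns (fused_vector, gates), where the gates sum to 1.
    """
    if not features or len(features) != len(log_variances):
        raise ValueError("need one log-variance per modality")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("feature vectors must share one dimension")

    # Softmax over negative log-variance is normalized inverse variance.
    scores = [-lv for lv in log_variances]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    gates = [e / total for e in exps]

    # Gate-weighted sum of the modality features, element by element.
    fused = [sum(g * f[i] for g, f in zip(gates, features)) for i in range(dim)]
    return fused, gates


# Hypothetical EEG and EOG feature vectors with assumed variances 0.1 and 0.3.
eeg = [0.8, 0.1, 0.4]
eog = [0.2, 0.5, 0.0]
fused, gates = uncertainty_weighted_fusion(
    [eeg, eog], [math.log(0.1), math.log(0.3)]
)
```

With these made-up inputs the EEG branch has one third of the EOG variance, so it receives three times the gate weight. In a trained network the log-variances would be predicted per sample, which is what would let the weighting shift when one signal becomes noisy.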