Dual-stage gated segmented multimodal emotion recognition method

Bibliographic Details
Published in: 智能科学与技术学报, Vol. 7, pp. 257-267
Main Authors: MA Fei, LI Shuzhi, YANG Feixia, XU Guangxian
Format: Journal Article
Language: Chinese
Published: POSTS&TELECOM PRESS Co., LTD, 01.06.2025
ISSN: 2096-6652

Summary: Multimodal emotion recognition has broad applications in mental health detection and affective computing. However, most existing methods rely on either global or local features, neglecting the joint modeling of both, which limits recognition performance. To address this, a Transformer-based dual-stage gated segmented multimodal emotion recognition method (DGM) was proposed. DGM adopts a segmented fusion architecture consisting of an interaction stage and a dual-stage gating stage. In the interaction stage, the OAGL fusion strategy was employed to model global-local cross-modal interactions, improving the efficiency of feature fusion. The dual-stage gating stage was designed to integrate local and global features, making full use of the available emotional information. Additionally, to resolve the misalignment of local temporal features across modalities, a scaled dot-product-based sequence alignment method was developed to enhance fusion accuracy. Experiments were conducted on three benchmark datasets (CMU-MOSI, CMU-MOSEI, …
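
The record does not reproduce the paper's implementation, so the following PyTorch sketch only illustrates the two mechanisms named in the abstract: scaled dot-product alignment of mismatched local temporal sequences, and a two-stage gate that fuses local and global features. All module names, dimensions, pooling choices, and gating equations here are assumptions for illustration; they are not the paper's OAGL strategy or the exact DGM architecture.

```python
# Hedged sketch of (1) scaled dot-product sequence alignment across
# modalities and (2) dual-stage gated fusion of local/global features.
# Every design detail below is an assumption, not the published DGM model.
import torch
import torch.nn as nn


class ScaledDotProductAlignment(nn.Module):
    """Re-index a source modality's sequence onto a target sequence's
    time axis via scaled dot-product attention (assumed form of the
    abstract's alignment step)."""

    def __init__(self, dim: int):
        super().__init__()
        self.scale = dim ** -0.5
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, target: torch.Tensor, source: torch.Tensor) -> torch.Tensor:
        # target: (B, T_t, D); source: (B, T_s, D) with T_t != T_s in general
        attn = torch.softmax(
            self.q(target) @ self.k(source).transpose(1, 2) * self.scale, dim=-1
        )
        return attn @ self.v(source)  # (B, T_t, D): source aligned to target


class DualStageGate(nn.Module):
    """Two sequential sigmoid gates: stage 1 filters the local feature,
    stage 2 balances the gated local feature against the global one."""

    def __init__(self, dim: int):
        super().__init__()
        self.local_gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
        self.global_gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, local: torch.Tensor, global_: torch.Tensor) -> torch.Tensor:
        # local, global_: (B, D) pooled local fusion and global summary
        g1 = self.local_gate(torch.cat([local, global_], dim=-1))
        gated_local = g1 * local                      # stage 1: filter local cues
        g2 = self.global_gate(torch.cat([gated_local, global_], dim=-1))
        return g2 * gated_local + (1 - g2) * global_  # stage 2: mix with global


# Usage sketch: align audio to text, pool, then gate against a global feature.
B, T_text, T_audio, D = 2, 20, 50, 128
text, audio = torch.randn(B, T_text, D), torch.randn(B, T_audio, D)
aligned = ScaledDotProductAlignment(D)(text, audio)   # (B, T_text, D)
local = aligned.mean(dim=1)                           # pooled local feature
global_feat = text.mean(dim=1)                        # stand-in global feature
fused = DualStageGate(D)(local, global_feat)          # (B, D) for a classifier head
```

Mean pooling and the residual-style mix in stage 2 are placeholder choices; the point of the sketch is only that alignment happens before fusion and that gating is applied in two passes, first over local features and then over the local-global combination.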