Dual-stage gated segmented multimodal emotion recognition method

Bibliographic Details
Published in: 智能科学与技术学报, Vol. 7, pp. 257-267
Main Authors: MA Fei, LI Shuzhi, YANG Feixia, XU Guangxian
Format: Journal Article
Language: Chinese
Published: POSTS&TELECOM PRESS Co., LTD, 01.06.2025
ISSN: 2096-6652

Summary: Multimodal emotion recognition has broad applications in mental health detection and affective computing. However, most existing methods rely on either global or local features, neglecting the joint modeling of both, which limits recognition performance. To address this, a Transformer-based dual-stage gated segmented multimodal emotion recognition method (DGM) was proposed. DGM adopts a segmented fusion architecture consisting of an interaction stage and a dual-stage gating stage. In the interaction stage, the OAGL fusion strategy was employed to model global-local cross-modal interactions, improving the efficiency of feature fusion. The dual-stage gating stage was designed to integrate local and global features, making full use of the available emotional information. Additionally, to resolve the misalignment of local temporal features across modalities, a scaled dot-product-based sequence alignment method was developed to enhance fusion accuracy. Experiments were conducted on three benchmark datasets (CMU-MOSI, CMU-MOSEI, …
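
The record does not reproduce the paper's implementation, so the following PyTorch sketch only illustrates the two mechanisms named in the abstract: scaled dot-product alignment of mismatched local temporal sequences, and a two-stage gate that fuses local and global features. All module names, dimensions, pooling choices, and gating equations here are assumptions for illustration; they are not the paper's OAGL strategy or the exact DGM architecture.

```python
# Hedged sketch of (1) scaled dot-product sequence alignment across
# modalities and (2) dual-stage gated fusion of local/global features.
# Every design detail below is an assumption, not the published DGM model.
import torch
import torch.nn as nn


class ScaledDotProductAlignment(nn.Module):
    """Re-index a source modality's sequence onto a target sequence's
    time axis via scaled dot-product attention (assumed form of the
    abstract's alignment step)."""

    def __init__(self, dim: int):
        super().__init__()
        self.scale = dim ** -0.5
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, target: torch.Tensor, source: torch.Tensor) -> torch.Tensor:
        # target: (B, T_t, D); source: (B, T_s, D) with T_t != T_s in general
        attn = torch.softmax(
            self.q(target) @ self.k(source).transpose(1, 2) * self.scale, dim=-1
        )
        return attn @ self.v(source)  # (B, T_t, D): source aligned to target


class DualStageGate(nn.Module):
    """Two sequential sigmoid gates: stage 1 filters the local feature,
    stage 2 balances the gated local feature against the global one."""

    def __init__(self, dim: int):
        super().__init__()
        self.local_gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
        self.global_gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, local: torch.Tensor, global_: torch.Tensor) -> torch.Tensor:
        # local, global_: (B, D) pooled local fusion and global summary
        g1 = self.local_gate(torch.cat([local, global_], dim=-1))
        gated_local = g1 * local                      # stage 1: filter local cues
        g2 = self.global_gate(torch.cat([gated_local, global_], dim=-1))
        return g2 * gated_local + (1 - g2) * global_  # stage 2: mix with global


# Usage sketch: align audio to text, pool, then gate against a global feature.
B, T_text, T_audio, D = 2, 20, 50, 128
text, audio = torch.randn(B, T_text, D), torch.randn(B, T_audio, D)
aligned = ScaledDotProductAlignment(D)(text, audio)   # (B, T_text, D)
local = aligned.mean(dim=1)                           # pooled local feature
global_feat = text.mean(dim=1)                        # stand-in global feature
fused = DualStageGate(D)(local, global_feat)          # (B, D) for a classifier head
```

Mean pooling and the residual-style mix in stage 2 are placeholder choices; the point of the sketch is only that alignment happens before fusion and that gating is applied in two passes, first over local features and then over the local-global combination.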