Self-Supervised ECG Representation Learning for Emotion Recognition

Bibliographic Details
Published in	IEEE Transactions on Affective Computing, Vol. 13, no. 3, pp. 1541-1554
Main Authors	Sarkar, Pritam; Etemad, Ali
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.07.2022
Subjects	Affective computing; Arousal; Datasets; ECG; Electrocardiography; Emotion recognition; Emotions; Feature extraction; Machine learning; Multi-task learning; Representations; Self-supervised learning; Stress; Supervised learning; Task analysis; Transformations
ISSN	1949-3045
DOI	10.1109/TAFFC.2020.3014842

Summary:	We exploit a self-supervised deep multi-task learning framework for electrocardiogram (ECG)-based emotion recognition. The proposed solution consists of two stages of learning: a) learning ECG representations and b) learning to classify emotions. ECG representations are learned by a signal transformation recognition network, which learns high-level abstract representations from unlabeled ECG data. Six different signal transformations are applied to the ECG signals, and transformation recognition is performed as pretext tasks. Training the model on pretext tasks helps the network learn spatiotemporal representations that generalize well across different datasets and different emotion categories. We transfer the weights of the self-supervised network to an emotion recognition network, where the convolutional layers are kept frozen and the dense layers are trained with labelled ECG data. We show that the proposed solution considerably improves performance compared to a network trained using fully supervised learning. New state-of-the-art results are set in classification of arousal, valence, affective states, and stress for the four utilized datasets. Extensive experiments are performed, providing insights into the impact of using a multi-task self-supervised structure instead of a single-task model, as well as the optimum level of difficulty required for the pretext self-supervised tasks.
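As a rough illustration of the first stage described in the summary, the sketch below builds labelled pretext examples from one unlabeled signal: the original plus six transformed copies, each tagged with the index of the transformation a recognition network would have to predict. The summary does not name the six transformations or their parameters, so the particular transformations here (noise, scaling, negation, time reversal, segment permutation, time warping), all function names, and all default values are assumptions chosen for illustration, not the authors' implementation. The network itself and the second stage (frozen convolutional layers, trainable dense layers) are not shown.

```python
import random

# Hypothetical label names; index 0 is the untransformed signal.
TRANSFORM_NAMES = ["original", "noise", "scale", "negate",
                   "reverse", "permute", "warp"]

def add_noise(x, rng, sigma=0.05):
    """Add zero-mean Gaussian noise to every sample (sigma is assumed)."""
    return [v + rng.gauss(0.0, sigma) for v in x]

def scale(x, factor=1.5):
    """Multiply the amplitude by a constant factor (factor is assumed)."""
    return [v * factor for v in x]

def negate(x):
    """Flip the signal vertically."""
    return [-v for v in x]

def reverse_time(x):
    """Flip the signal along the time axis."""
    return list(x[::-1])

def permute_segments(x, rng, n_segments=4):
    """Cut the signal into segments and shuffle their order."""
    size = len(x) // n_segments
    segments = [list(x[i * size:(i + 1) * size]) for i in range(n_segments - 1)]
    segments.append(list(x[(n_segments - 1) * size:]))  # last segment takes the remainder
    rng.shuffle(segments)
    return [v for seg in segments for v in seg]

def time_warp(x, gamma=1.5):
    """Resample on a nonlinearly warped time axis using linear interpolation.

    Requires len(x) >= 2. Length and endpoints are preserved.
    """
    n = len(x)
    out = []
    for i in range(n):
        pos = (n - 1) * (i / (n - 1)) ** gamma  # warped read position
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(x[lo] * (1 - frac) + x[hi] * frac)
    return out

def make_pretext_examples(signal, seed=0):
    """Return (variant, label) pairs: the original plus six transformed copies."""
    rng = random.Random(seed)
    variants = [
        list(signal),
        add_noise(signal, rng),
        scale(signal),
        negate(signal),
        reverse_time(signal),
        permute_segments(signal, rng),
        time_warp(signal),
    ]
    return [(variant, label) for label, variant in enumerate(variants)]
```

Calling `make_pretext_examples(segment)` on a list of ECG samples yields seven `(signal, label)` pairs. No emotion labels are needed at this point, which is what lets the representation be learned from unlabeled recordings.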