Multi-modal Emotion Recognition Using Canonical Correlations and Acoustic Features

Bibliographic Details
Published in	2010 20th International Conference on Pattern Recognition, pp. 4133-4136
Main Authors	Gajsek, Rok; Struc, Vitomir; Mihelic, France
Format	Conference Proceeding
Language	English
Published	IEEE 01.08.2010
Subjects	Correlation; Emotion recognition; Face; Feature extraction; multi-modal emotions; Support vector machines; Video sequences
ISBN	1424475422 9781424475421
ISSN	1051-4651
DOI	10.1109/ICPR.2010.1005

More Information
Summary:	Information about the psycho-physical state of the subject is becoming a valuable addition to modern audio or video recognition systems. As well as enabling a better user experience, it can also help improve the recognition accuracy of the base system. In this article, we present our approach to a multi-modal (audio-video) emotion recognition system. For the audio sub-system, a feature set comprising prosodic, spectral and cepstral features is selected, and a support vector classifier is used to produce scores for each emotional category. For the video sub-system, a novel approach is presented that does not rely on tracking specific facial landmarks and thus eliminates the problems that usually arise when the tracking algorithm fails to detect the correct area. The system is evaluated on the eNTERFACE database, and the recognition accuracy of our audio-video fusion is compared with results published in the literature.