Multi-modal Emotion Recognition Using Canonical Correlations and Acoustic Features
Published in | 2010 20th International Conference on Pattern Recognition pp. 4133 - 4136 |
---|---|
Main Authors | Gajsek, Rok; Struc, Vitomir; Mihelic, France |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.08.2010 |
ISBN | 1424475422 9781424475421 |
ISSN | 1051-4651 |
DOI | 10.1109/ICPR.2010.1005 |
Abstract | Information about the psycho-physical state of the subject is becoming a valuable addition to modern audio or video recognition systems. As well as enabling a better user experience, it can also improve the recognition accuracy of the base system. In this article, we present our approach to a multi-modal (audio-video) emotion recognition system. For the audio sub-system, a feature set comprising prosodic, spectral and cepstral features is selected, and a support vector classifier is used to produce the scores for each emotional category. For the video sub-system, a novel approach is presented that does not rely on tracking specific facial landmarks and thus avoids the problems that usually arise when the tracking algorithm fails to detect the correct area. The system is evaluated on the eNTERFACE database, and the recognition accuracy of our audio-video fusion is compared with published results in the literature. |
---|---|
Author | Struc, Vitomir; Gajsek, Rok; Mihelic, France |
Author_xml | – sequence: 1 givenname: Rok surname: Gajsek fullname: Gajsek, Rok email: rok.gajsek@fe.uni-lj.si organization: Fac. of Electr. Eng., Univ. of Ljubljana, Ljubljana, Slovenia – sequence: 2 givenname: Vitomir surname: Struc fullname: Struc, Vitomir – sequence: 3 givenname: France surname: Mihelic fullname: Mihelic, France email: france.mihelic@fe.uni-lj.si organization: Fac. of Electr. Eng., Univ. of Ljubljana, Ljubljana, Slovenia |
ContentType | Conference Proceeding |
Discipline | Engineering Computer Science |
EISBN | 9781424475414 9780769541099 1424475414 0769541097 |
EndPage | 4136 |
ExternalDocumentID | 5597732 |
Genre | orig-research |
IsPeerReviewed | false |
IsScholarly | true |
Language | English |
PageCount | 4 |
ParticipantIDs | ieee_primary_5597732 |
PublicationCentury | 2000 |
PublicationDate | 2010-Aug. |
PublicationDateYYYYMMDD | 2010-08-01 |
PublicationDate_xml | – month: 08 year: 2010 text: 2010-Aug. |
PublicationDecade | 2010 |
PublicationTitle | 2010 20th International Conference on Pattern Recognition |
PublicationTitleAbbrev | ICPR |
PublicationYear | 2010 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
StartPage | 4133 |
SubjectTerms | Correlation; Emotion recognition; Face; Feature extraction; multi-modal emotions; Support vector machines; Video sequences |
Title | Multi-modal Emotion Recognition Using Canonical Correlations and Acoustic Features |
URI | https://ieeexplore.ieee.org/document/5597732 |
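
The abstract describes per-category scores produced by the audio sub-system and an audio-video fusion step, but it does not state which fusion rule is used. The following is only a minimal sketch of one common option, score-level fusion by a weighted sum of normalized scores; it is an assumption for illustration, not the authors' method. The function names, the `audio_weight` parameter, the softmax normalization and the example scores are all hypothetical. The six category labels follow the emotion classes of the eNTERFACE database.

```python
import math

# Emotion categories of the eNTERFACE database; the ordering here is arbitrary.
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]


def softmax(scores):
    """Map raw classifier scores to positive values that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(audio_scores, video_scores, audio_weight=0.5):
    """Weighted-sum fusion of per-category scores (an assumed rule).

    audio_weight is a hypothetical tuning parameter in [0, 1]; the video
    sub-system receives the remaining weight.
    """
    if len(audio_scores) != len(video_scores):
        raise ValueError("both modalities must score the same categories")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    audio = softmax(audio_scores)
    video = softmax(video_scores)
    return [audio_weight * a + (1.0 - audio_weight) * v
            for a, v in zip(audio, video)]


def predict_emotion(audio_scores, video_scores, audio_weight=0.5):
    """Return the label of the category with the highest fused score."""
    fused = fuse_scores(audio_scores, video_scores, audio_weight)
    best = max(range(len(fused)), key=fused.__getitem__)
    return EMOTIONS[best]


if __name__ == "__main__":
    # Made-up scores: audio leans towards anger, video towards happiness.
    audio = [2.0, 0.0, 0.0, 1.0, 0.0, 0.0]
    video = [0.0, 0.0, 0.0, 3.0, 0.0, 0.0]
    print(predict_emotion(audio, video))
```

Normalizing each modality before summing keeps one sub-system from dominating only because its raw scores span a larger range. In practice the weight would be tuned on held-out data rather than fixed at 0.5.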