Multimodal Saliency and Fusion for Movie Summarization Based on Aural, Visual, and Textual Attention

Bibliographic Details
Published in IEEE Transactions on Multimedia, Vol. 15, No. 7, pp. 1553–1568
Main Authors Evangelopoulos, Georgios, Zlatintsi, Athanasia, Potamianos, Alexandros, Maragos, Petros, Rapantzikos, Konstantinos, Skoumas, Georgios, Avrithis, Yannis
Format Journal Article
Language English
Published New York, NY: IEEE, 01.11.2013
Institute of Electrical and Electronics Engineers
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
ISSN 1520-9210
1941-0077
DOI 10.1109/TMM.2013.2267205


Abstract Multimodal streams of sensory information are naturally parsed and integrated by humans using signal-level feature extraction and higher level cognitive processes. Detection of attention-invoking audiovisual segments is formulated in this work on the basis of saliency models for the audio, visual, and textual information conveyed in a video stream. Aural or auditory saliency is assessed by cues that quantify multifrequency waveform modulations, extracted through nonlinear operators and energy tracking. Visual saliency is measured through a spatiotemporal attention model driven by intensity, color, and orientation. Textual or linguistic saliency is extracted from part-of-speech tagging on the subtitle information available with most movie distributions. The individual saliency streams, obtained from modality-dependent cues, are integrated in a multimodal saliency curve, modeling the time-varying perceptual importance of the composite video stream and signifying prevailing sensory events. The multimodal saliency representation forms the basis of a generic, bottom-up video summarization algorithm. Different fusion schemes are evaluated on a movie database of multimodal saliency annotations with comparative results provided across modalities. The produced summaries, based on low-level features and content-independent fusion and selection, are of subjectively high aesthetic and informative quality.
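The fusion-and-selection pipeline outlined in the abstract can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the paper's actual audio, visual, and text features are replaced by pre-computed per-frame saliency arrays, and the fusion weights and segment length are hypothetical choices standing in for the fusion schemes the paper evaluates.

```python
import numpy as np

def normalize(s):
    """Scale a saliency stream to [0, 1] so modalities are comparable."""
    s = np.asarray(s, dtype=float)
    span = s.max() - s.min()
    return (s - s.min()) / span if span > 0 else np.zeros_like(s)

def fuse_saliency(audio, visual, text, weights=(0.4, 0.4, 0.2)):
    """Weighted linear fusion into a single multimodal saliency curve.

    The weights here are hypothetical; the paper compares several
    fusion schemes rather than fixing one set of weights.
    """
    streams = [normalize(audio), normalize(visual), normalize(text)]
    return sum(w * s for w, s in zip(weights, streams))

def top_segments(curve, seg_len, k):
    """Rank fixed-length segments by mean saliency and keep the top k,
    returned in temporal order (a stand-in for summary selection)."""
    n = len(curve) // seg_len
    scores = [curve[i * seg_len:(i + 1) * seg_len].mean() for i in range(n)]
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(order)

# Toy usage: 100 "frames", summarize to the 2 most salient 10-frame segments.
rng = np.random.default_rng(0)
curve = fuse_saliency(rng.random(100), rng.random(100), rng.random(100))
print(top_segments(curve, seg_len=10, k=2))
```

In this sketch the multimodal curve is a convex combination of the normalized per-modality streams, so a segment scores highly only when at least one modality is strongly salient there; the paper's bottom-up summarizer builds on the same idea of selecting the peaks of the fused curve.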
Author Zlatintsi, Athanasia
Evangelopoulos, Georgios
Rapantzikos, Konstantinos
Avrithis, Yannis
Skoumas, Georgios
Potamianos, Alexandros
Maragos, Petros
Author_xml – sequence: 1
  givenname: Georgios
  surname: Evangelopoulos
  fullname: Evangelopoulos, Georgios
  email: gevag@cs.ntua.gr
  organization: School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece
– sequence: 2
  givenname: Athanasia
  surname: Zlatintsi
  fullname: Zlatintsi, Athanasia
  email: nzlat@cs.ntua.gr
  organization: School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece
– sequence: 3
  givenname: Alexandros
  surname: Potamianos
  fullname: Potamianos, Alexandros
  email: potam@telecom.tuc.gr
  organization: Department of Electronics and Computer Engineering, Technical University of Crete, Chania, Greece
– sequence: 4
  givenname: Petros
  surname: Maragos
  fullname: Maragos, Petros
  email: maragos@cs.ntua.gr
  organization: School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece
– sequence: 5
  givenname: Konstantinos
  surname: Rapantzikos
  fullname: Rapantzikos, Konstantinos
  email: rap@image.ntua.gr
  organization: School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece
– sequence: 6
  givenname: Georgios
  surname: Skoumas
  fullname: Skoumas, Georgios
  email: gskoumas@dblab.ece.ntua.gr
  organization: School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece
– sequence: 7
  givenname: Yannis
  surname: Avrithis
  fullname: Avrithis, Yannis
  email: iavr@image.ntua.gr
  organization: School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece
BackLink http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=27861117$$DView record in Pascal Francis
BookMark eNp9kUFr3DAQRkVJoUnae6EXQynkUG9mRpZsH7ehaQtZcsi2VzOVx6DgtVPJLkl-feTukkMOPc1IvKcR852oo2EcRKn3CCtEqM-3m82KAPWKyJYE5pU6xrrAHKAsj1JvCPKaEN6okxhvAbAwUB6rdjP3k9-NLffZDfdeBveQ8dBml3P045B1Y8g2418v2c2823Hwjzwt9184SpulZj0H7j9nv3ycl7qoW7mf0iFbT5MMC_1Wve64j_LuUE_Vz8uv24vv-dX1tx8X66vcFQRTTgU7RudM57CoDbZUd8LG8W8Ddd06bnVlKyvAzmiNbSW6pBbQmLKQzmp9qs72796F8c8scWp2Pjrpex5knGODhS2MoaqChH58gd6OcxjS7xJVkLZEhhL16UBxdNx3gQfnY3MXfFrFQ0NlZRGxTBzsORfGGIN0zwhCs8TTpHiaJZ7mEE9S7AvF-enfbqfAvv-f-GEvehF5nmMNlZpIPwHs-J12
CODEN ITMUF8
CitedBy_id crossref_primary_10_1109_ACCESS_2023_3308967
crossref_primary_10_1109_TMM_2018_2839523
crossref_primary_10_1016_j_imavis_2021_104216
crossref_primary_10_1016_j_inffus_2023_02_028
crossref_primary_10_1109_TMM_2022_3141256
crossref_primary_10_1016_j_neucom_2021_04_072
crossref_primary_10_1111_psyp_70036
crossref_primary_10_1080_00087041_2024_2436687
crossref_primary_10_1109_TMM_2019_2918730
crossref_primary_10_1142_S0219649222500666
crossref_primary_10_1007_s11042_017_4807_6
crossref_primary_10_1016_j_patrec_2018_01_002
crossref_primary_10_1109_TMM_2017_2703939
crossref_primary_10_1109_TIP_2019_2936112
crossref_primary_10_4018_IJMDEM_2019070101
crossref_primary_10_1109_ACCESS_2017_2776344
crossref_primary_10_1186_s13640_017_0194_1
crossref_primary_10_1109_TMM_2018_2829162
crossref_primary_10_1049_ipr2_13310
crossref_primary_10_1016_j_actpsy_2024_104206
crossref_primary_10_1007_s11042_018_5969_6
crossref_primary_10_1016_j_engappai_2022_105667
crossref_primary_10_1007_s11042_016_4061_3
crossref_primary_10_1016_j_aca_2024_343302
crossref_primary_10_1109_TMM_2018_2876046
crossref_primary_10_1111_cgf_13654
crossref_primary_10_1007_s10994_021_06112_5
crossref_primary_10_1109_TMM_2020_2987682
crossref_primary_10_7498_aps_66_109501
crossref_primary_10_1016_j_neucom_2024_128270
crossref_primary_10_1145_3322240
crossref_primary_10_1109_MMUL_2018_2883127
crossref_primary_10_3390_app10093056
crossref_primary_10_1007_s42835_020_00461_2
crossref_primary_10_1016_j_ipm_2014_12_001
crossref_primary_10_1007_s10462_023_10429_z
crossref_primary_10_1016_j_neucom_2016_08_129
crossref_primary_10_1109_TMM_2018_2859590
crossref_primary_10_1016_j_engappai_2024_108844
crossref_primary_10_1016_j_knosys_2021_106970
crossref_primary_10_1016_j_patrec_2018_07_016
crossref_primary_10_1016_j_inffus_2021_04_016
crossref_primary_10_1145_2632267
crossref_primary_10_1109_TAFFC_2023_3265653
crossref_primary_10_1016_j_datak_2023_102150
crossref_primary_10_1109_TMM_2022_3157993
crossref_primary_10_1145_3617833
crossref_primary_10_1155_2016_7437860
crossref_primary_10_1109_TPAMI_2018_2798607
crossref_primary_10_1109_TITS_2016_2601655
crossref_primary_10_1109_TCSS_2024_3411486
crossref_primary_10_1145_3508361
crossref_primary_10_1109_TKDE_2021_3080293
crossref_primary_10_1371_journal_pone_0228579
crossref_primary_10_1007_s11704_021_0611_6
crossref_primary_10_1109_TMM_2023_3249481
crossref_primary_10_1109_LSP_2017_2775212
crossref_primary_10_1016_j_eswa_2019_01_003
crossref_primary_10_1016_j_jvcir_2024_104279
crossref_primary_10_1109_TMM_2019_2935678
crossref_primary_10_3389_fnins_2023_1173704
crossref_primary_10_1109_TIP_2016_2615289
crossref_primary_10_3389_frobt_2015_00028
crossref_primary_10_1007_s12559_015_9326_z
crossref_primary_10_3390_e24060764
crossref_primary_10_1109_TCSVT_2018_2844780
crossref_primary_10_1155_2019_3581419
crossref_primary_10_1142_S0219749923500041
crossref_primary_10_1109_JPROC_2021_3117472
crossref_primary_10_1007_s10844_016_0441_4
crossref_primary_10_1016_j_imavis_2021_104267
crossref_primary_10_1007_s12193_018_0268_0
crossref_primary_10_1007_s00521_024_09908_3
crossref_primary_10_1145_3656580
crossref_primary_10_1109_TMM_2017_2777665
crossref_primary_10_1109_TMM_2019_2940851
crossref_primary_10_1145_3445794
crossref_primary_10_1109_ACCESS_2022_3216890
crossref_primary_10_1109_TMM_2020_3006372
crossref_primary_10_1145_3584700
crossref_primary_10_3233_JIFS_223752
crossref_primary_10_1080_08839514_2025_2462382
crossref_primary_10_1016_j_procs_2015_03_209
crossref_primary_10_1109_TCSVT_2022_3203421
crossref_primary_10_1098_rstb_2016_0101
crossref_primary_10_1145_2996463
crossref_primary_10_1007_s11042_015_3210_4
crossref_primary_10_1007_s11042_021_10977_y
crossref_primary_10_1016_j_image_2015_08_004
crossref_primary_10_3390_ai1040030
crossref_primary_10_1016_j_inffus_2020_08_006
crossref_primary_10_1016_j_asoc_2016_03_022
crossref_primary_10_1016_j_image_2016_03_005
crossref_primary_10_1007_s00530_022_01040_3
crossref_primary_10_1109_TNNLS_2022_3161314
crossref_primary_10_3390_app11115260
crossref_primary_10_1016_j_jvcir_2017_02_005
crossref_primary_10_1109_ACCESS_2019_2955637
crossref_primary_10_1109_JTEHM_2018_2863386
crossref_primary_10_1109_TIP_2020_2966082
crossref_primary_10_1109_TCDS_2021_3094974
crossref_primary_10_1038_s42256_022_00488_2
crossref_primary_10_1016_j_image_2019_05_001
crossref_primary_10_1109_TAFFC_2024_3354382
crossref_primary_10_1145_3347712
crossref_primary_10_1109_TMM_2018_2794265
crossref_primary_10_1109_TKDE_2018_2848260
crossref_primary_10_1007_s11042_016_3577_x
crossref_primary_10_1364_JOSAA_34_000814
crossref_primary_10_1049_ipr2_12960
crossref_primary_10_1016_j_iintel_2023_100061
crossref_primary_10_1016_j_dsp_2018_03_010
crossref_primary_10_1016_j_neucom_2023_03_013
crossref_primary_10_1109_TPAMI_2023_3325770
crossref_primary_10_1109_TMM_2018_2844689
crossref_primary_10_3390_ijgi10100636
crossref_primary_10_1007_s41095_015_0015_3
crossref_primary_10_1109_TMM_2019_2929943
crossref_primary_10_1109_TMM_2019_2947352
crossref_primary_10_1016_j_jksuci_2022_09_005
crossref_primary_10_3389_fneur_2024_1444795
crossref_primary_10_1007_s00498_017_0207_8
crossref_primary_10_1109_TPAMI_2022_3171983
crossref_primary_10_1007_s11042_014_2126_8
Cites_doi 10.1371/journal.pbio.1000129
10.1109/TASL.2010.2047756
10.1109/TIP.2009.2030969
10.1038/35058500
10.1145/345508.345566
10.1109/79.888862
10.3389/fnhum.2010.00168
10.1007/978-0-387-71305-2_5
10.1613/jair.1523
10.1109/ICASSP.2011.5946961
10.1109/TASL.2009.2014795
10.1109/76.809162
10.1109/TCSVT.2007.890857
10.1109/LSP.2005.853050
10.1016/j.jvcir.2007.04.002
10.1037/0033-295X.113.4.766
10.1109/TPAMI.2009.27
10.1162/neco.2007.19.10.2780
10.1016/j.patrec.2010.02.005
10.1016/S0042-6989(01)00250-4
10.1109/MSP.2006.1621451
10.1109/TPAMI.2011.53
10.1007/978-3-642-00958-7_37
10.1146/annurev.neuro.30.051606.094256
10.1023/A:1012460413855
10.1109/TMM.2005.854410
10.1007/978-3-540-30586-6_70
10.1109/TCSVT.2004.841694
10.1146/annurev.ne.13.030190.000325
10.1109/ICCV.2001.937645
10.1016/0306-4573(88)90021-0
10.1109/TASL.2006.872625
10.1007/11526346_1
10.1109/78.258071
10.1145/265563.265572
10.1109/ICME.2004.1394309
10.1016/j.jvcir.2010.01.007
10.1109/ICIP.2010.5650991
10.1109/CVPR.1997.609414
10.1016/j.cub.2005.09.040
10.1016/j.tics.2004.08.008
10.1145/1198302.1198305
10.1109/TIP.2011.2156803
10.1121/1.414997
10.1002/9781118219546.ch21
10.1109/ICASSP.2009.4960393
10.1109/CVPR.2009.5206596
10.1109/34.730558
10.1007/s12559-011-9097-0
10.1007/s00530-010-0182-0
10.1007/3-540-36127-8_20
10.1038/nrn1411
10.1016/j.conb.2007.07.011
10.1167/9.3.5
10.1145/215206.215333
10.1006/csla.1998.0043
10.1109/78.277799
10.1109/CVPR.2009.5206525
10.1109/WIIAT.2008.175
10.1109/TNN.2004.832710
10.1146/annurev.ne.18.030195.001205
ContentType Journal Article
Copyright 2015 INIST-CNRS
Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Nov 2013
Copyright_xml – notice: 2015 INIST-CNRS
– notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Nov 2013
DBID 97E
RIA
RIE
AAYXX
CITATION
IQODW
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
7U5
F28
FR3
DOI 10.1109/TMM.2013.2267205
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
Pascal-Francis
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Solid State and Superconductivity Abstracts
ANTE: Abstracts in New Technology & Engineering
Engineering Research Database
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
Solid State and Superconductivity Abstracts
Engineering Research Database
ANTE: Abstracts in New Technology & Engineering
DatabaseTitleList Technology Research Database

Technology Research Database
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
Computer Science
Applied Sciences
EISSN 1941-0077
EndPage 1568
ExternalDocumentID 3100663611
27861117
10_1109_TMM_2013_2267205
6527322
Genre orig-research
GroupedDBID -~X
0R~
29I
4.4
5GY
5VS
6IK
97E
AAJGR
AARMG
AASAJ
AAWTH
ABAZT
ABQJQ
ABVLG
ACGFO
ACGFS
ACIWK
AENEX
AETIX
AGQYO
AGSQL
AHBIQ
AI.
AIBXA
AKJIK
AKQYR
ALLEH
ALMA_UNASSIGNED_HOLDINGS
ATWAV
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CS3
DU5
EBS
EJD
HZ~
H~9
IFIPE
IFJZH
IPLJI
JAVBF
LAI
M43
O9-
OCL
P2P
PQQKQ
RIA
RIE
RNS
TN5
VH1
ZY4
AAYXX
CITATION
ABTAH
IQODW
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
7U5
F28
FR3
ID FETCH-LOGICAL-c420t-24aca1cc5fc14951d29fea5cab5099dcad38686e0ac5331d8e372d015574ef633
IEDL.DBID RIE
ISSN 1520-9210
IngestDate Mon Sep 29 06:41:06 EDT 2025
Mon Jun 30 04:22:39 EDT 2025
Wed Apr 02 07:21:45 EDT 2025
Wed Oct 01 01:33:21 EDT 2025
Thu Apr 24 23:00:07 EDT 2025
Wed Aug 27 06:27:50 EDT 2025
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 7
Keywords Space time correlation
Tracking
multimodal saliency
Video signal
Modeling
Audiovisual
multistream processing
Linguistics
Selection criterion
text saliency
Multimodal interface
Audiovisual equipment
audio saliency
Pattern extraction
video summarization
Textual data
Streaming
Computer vision
Abstract
movie summarization
Attention
Model driven architecture
Text
Annotation
Grammatical inference
fusion
Dimension reduction
Cognitive theory
Hearing
visual saliency
Multimodality
Bottom up method
Stimulus salience
Data fusion
Feature extraction
Visual information
Visual attention
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
CC BY 4.0
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c420t-24aca1cc5fc14951d29fea5cab5099dcad38686e0ac5331d8e372d015574ef633
Notes ObjectType-Article-2
SourceType-Scholarly Journals-1
ObjectType-Feature-1
content type line 14
content type line 23
PQID 1442362252
PQPubID 75737
PageCount 16
ParticipantIDs proquest_journals_1442362252
crossref_citationtrail_10_1109_TMM_2013_2267205
pascalfrancis_primary_27861117
crossref_primary_10_1109_TMM_2013_2267205
proquest_miscellaneous_1464552880
ieee_primary_6527322
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 2013-11-01
PublicationDateYYYYMMDD 2013-11-01
PublicationDate_xml – month: 11
  year: 2013
  text: 2013-11-01
  day: 01
PublicationDecade 2010
PublicationPlace New York, NY
PublicationPlace_xml – name: New York, NY
– name: Piscataway
PublicationTitle IEEE transactions on multimedia
PublicationTitleAbbrev TMM
PublicationYear 2013
Publisher IEEE
Institute of Electrical and Electronics Engineers
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: Institute of Electrical and Electronics Engineers
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
References ref57
ref13
ref56
ref12
ref59
ref15
ref58
ref14
ref53
ref52
ref55
ref54
gong (ref22) 2000
ref17
ref16
derpanis (ref64) 2005
ref18
koch (ref6) 1985; 4
ref51
ref50
ref46
ref45
ref48
ref47
ref42
ref41
ref43
jurafsky (ref70) 2008
ref49
ref8
ref7
ref9
ref4
ref3
ref5
zhuang (ref19) 1998
ref40
ref34
ref37
ref36
ref31
ref74
ref30
ref33
schmid (ref69) 1994
ref32
ref2
ref1
ref39
deng (ref67) 2000
zlatintsi (ref75) 2012
coensel (ref44) 2010
ponceleon (ref26) 1999
guo (ref35) 2010; 19
ref71
ref73
ref72
gao (ref38) 2009; 31
ref68
ref24
ref23
ma (ref10) 2005; 7
ref25
ref63
ref66
ref21
uchihashi (ref20) 1999
ref28
ref27
ngo (ref11) 2005; 15
ref29
hering (ref62) 1964
ref60
pellom (ref65) 2001
yuille (ref61) 1996
References_xml – ident: ref5
  doi: 10.1371/journal.pbio.1000129
– start-page: 1294
  year: 2012
  ident: ref75
  article-title: A saliency-based approach to audio event detection and summarization
  publication-title: Proc 20th Eur Signal Process Conf
– ident: ref68
  doi: 10.1109/TASL.2010.2047756
– volume: 19
  start-page: 185
  year: 2010
  ident: ref35
  article-title: A novel multiresolution spatiotemporal saliency detection model and its applications in image and video compression
  publication-title: IEEE Trans Image Process
  doi: 10.1109/TIP.2009.2030969
– ident: ref7
  doi: 10.1038/35058500
– ident: ref55
  doi: 10.1145/345508.345566
– ident: ref14
  doi: 10.1109/79.888862
– ident: ref9
  doi: 10.3389/fnhum.2010.00168
– ident: ref43
  doi: 10.1007/978-0-387-71305-2_5
– ident: ref54
  doi: 10.1613/jair.1523
– ident: ref73
  doi: 10.1109/ICASSP.2011.5946961
– ident: ref45
  doi: 10.1109/TASL.2009.2014795
– ident: ref21
  doi: 10.1109/76.809162
– ident: ref12
  doi: 10.1109/TCSVT.2007.890857
– ident: ref49
  doi: 10.1109/LSP.2005.853050
– ident: ref18
  doi: 10.1016/j.jvcir.2007.04.002
– ident: ref31
  doi: 10.1037/0033-295X.113.4.766
– volume: 31
  start-page: 989
  year: 2009
  ident: ref38
  article-title: Discriminant saliency, the detection of suspicious coincidences, and applications to visual recognition
  publication-title: IEEE Trans Pattern Anal Mach Intell
  doi: 10.1109/TPAMI.2009.27
– start-page: iii
  year: 2005
  ident: ref64
  article-title: Three-dimensional $n$th derivative of Gaussian separable steerable filters
  publication-title: Proc IEEE Int Conf Image Process
– ident: ref72
  doi: 10.1162/neco.2007.19.10.2780
– ident: ref46
  doi: 10.1016/j.patrec.2010.02.005
– ident: ref29
  doi: 10.1016/S0042-6989(01)00250-4
– ident: ref15
  doi: 10.1109/MSP.2006.1621451
– ident: ref30
  doi: 10.1109/TPAMI.2011.53
– ident: ref51
  doi: 10.1007/978-3-642-00958-7_37
– ident: ref2
  doi: 10.1146/annurev.neuro.30.051606.094256
– start-page: 806
  year: 2000
  ident: ref67
  article-title: Large-vocabulary speech recognition under adverse acoustic environments
  publication-title: Proc Int Conf Spoken Language Process
– ident: ref36
  doi: 10.1023/A:1012460413855
– volume: 7
  start-page: 907
  year: 2005
  ident: ref10
  article-title: A generic framework of user attention model and its application in video summarization
  publication-title: IEEE Trans Multimedia
  doi: 10.1109/TMM.2005.854410
– start-page: 866
  year: 1998
  ident: ref19
  article-title: Adaptive key frame extraction using unsupervised clustering
  publication-title: Proc IEEE Int Conf Image Process
– year: 2008
  ident: ref70
  publication-title: Speech and Language Processing
– ident: ref52
  doi: 10.1007/978-3-540-30586-6_70
– volume: 15
  start-page: 296
  year: 2005
  ident: ref11
  article-title: Video summarization and scene detection by graph modeling
  publication-title: IEEE Trans Circuits Syst Video Technol
  doi: 10.1109/TCSVT.2004.841694
– start-page: 174
  year: 2000
  ident: ref22
  article-title: Video summarization using singular value decomposition
  publication-title: Proc IEEE Conf Comput Vis Pattern Recognit
– ident: ref1
  doi: 10.1146/annurev.ne.13.030190.000325
– ident: ref23
  doi: 10.1109/ICCV.2001.937645
– start-page: 44
  year: 1994
  ident: ref69
  article-title: Probabilistic part-of-speech tagging using decision trees
  publication-title: Proc Int Conf New Methods Language Process
– ident: ref50
  doi: 10.1016/0306-4573(88)90021-0
– ident: ref48
  doi: 10.1109/TASL.2006.872625
– ident: ref27
  doi: 10.1007/11526346_1
– ident: ref60
  doi: 10.1109/78.258071
– ident: ref24
  doi: 10.1145/265563.265572
– ident: ref28
  doi: 10.1109/ICME.2004.1394309
– ident: ref17
  doi: 10.1016/j.jvcir.2010.01.007
– start-page: 123
  year: 1996
  ident: ref61
  publication-title: Bayesian Decision Theory and Psychophysics
– ident: ref39
  doi: 10.1109/ICIP.2010.5650991
– ident: ref25
  doi: 10.1109/CVPR.1997.609414
– ident: ref8
  doi: 10.1016/j.cub.2005.09.040
– ident: ref59
  doi: 10.1016/j.tics.2004.08.008
– ident: ref16
  doi: 10.1145/1198302.1198305
– ident: ref32
  doi: 10.1109/TIP.2011.2156803
– ident: ref47
  doi: 10.1121/1.414997
– ident: ref74
  doi: 10.1002/9781118219546.ch21
– start-page: 383
  year: 1999
  ident: ref20
  article-title: Video Manga: generating semantically meaningful video summaries
  publication-title: Proc 7th ACM Int Conf Multimedia
– ident: ref13
  doi: 10.1109/ICASSP.2009.4960393
– start-page: 887
  year: 2010
  ident: ref44
  article-title: A model of saliency-based auditory attention to environmental sound
  publication-title: Proc 20th Int Congress Acoust
– ident: ref34
  doi: 10.1109/CVPR.2009.5206596
– ident: ref63
  doi: 10.1109/34.730558
– ident: ref41
  doi: 10.1007/s12559-011-9097-0
– ident: ref71
  doi: 10.1007/s00530-010-0182-0
– ident: ref57
  doi: 10.1007/3-540-36127-8_20
– year: 2001
  ident: ref65
  publication-title: SONIC: The University of Colorado Continuous Speech Recognizer
– ident: ref33
  doi: 10.1038/nrn1411
– ident: ref3
  doi: 10.1016/j.conb.2007.07.011
– volume: 4
  start-page: 219
  year: 1985
  ident: ref6
  article-title: Shifts in selective visual attention: towards the underlying neural circuitry
  publication-title: Human Neurobiol
– ident: ref37
  doi: 10.1167/9.3.5
– ident: ref56
  doi: 10.1145/215206.215333
– ident: ref66
  doi: 10.1006/csla.1998.0043
– start-page: 199
  year: 1999
  ident: ref26
  article-title: CueVideo automated multimedia indexing and retrieval
  publication-title: Proc ACM Int l Conf Multimedia
– ident: ref58
  doi: 10.1109/78.277799
– ident: ref40
  doi: 10.1109/CVPR.2009.5206525
– ident: ref53
  doi: 10.1109/WIIAT.2008.175
– year: 1964
  ident: ref62
  publication-title: Outlines of a Theory of the Light Sense
– ident: ref42
  doi: 10.1109/TNN.2004.832710
– ident: ref4
  doi: 10.1146/annurev.ne.18.030195.001205
SSID ssj0014507
Score 2.5340168
SourceID proquest
pascalfrancis
crossref
ieee
SourceType Aggregation Database
Index Database
Enrichment Source
Publisher
StartPage 1553
SubjectTerms Algorithms
Applied sciences
Artificial intelligence
Attention
audio saliency
Biological and medical sciences
Computational modeling
Computer science; control theory; systems
Cues
Data processing. List processing. Character string processing
Exact sciences and technology
Feature extraction
Fundamental and applied biological sciences. Psychology
fusion
Memory organisation. Data processing
Modulation
Motion pictures
movie summarization
multimodal saliency
multistream processing
Pattern recognition. Digital image processing. Computational geometry
Perception
Psychology. Psychoanalysis. Psychiatry
Psychology. Psychophysiology
Semantics
Software
Speech and sound recognition and synthesis. Linguistics
Streaming media
Streams
Task analysis
text saliency
video summarization
Vision
Visual
visual saliency
Visualization
Waveforms
Title Multimodal Saliency and Fusion for Movie Summarization Based on Aural, Visual, and Textual Attention
URI https://ieeexplore.ieee.org/document/6527322
https://www.proquest.com/docview/1442362252
https://www.proquest.com/docview/1464552880
Volume 15
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVIEE
  databaseName: IEEE Electronic Library (IEL)
  customDbUrl:
  eissn: 1941-0077
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0014507
  issn: 1520-9210
  databaseCode: RIE
  dateStart: 19990101
  isFulltext: true
  titleUrlDefault: https://ieeexplore.ieee.org/
  providerName: IEEE