Use of bimodal coherence to resolve the permutation problem in convolutive BSS

Bibliographic Details
Published in Signal Processing, Vol. 92, no. 8, pp. 1916-1927
Main Authors Liu, Qingju; Wang, Wenwu; Jackson, Philip
Format Journal Article
Language English
Published Elsevier B.V., 01.08.2012
ISSN 0165-1684
EISSN 1872-7557
DOI 10.1016/j.sigpro.2011.11.007

Abstract Recent studies show that the facial information contained in visual speech can improve the performance of audio-only blind source separation (BSS) algorithms. Such information is exploited by statistically characterizing the coherence between audio and visual speech using, e.g., a Gaussian mixture model (GMM). In this paper, we present three contributions. First, using synchronized audio–visual features, we propose an adapted expectation maximization (AEM) algorithm to model the audio–visual coherence in an off-line training process. Second, to improve the accuracy of this coherence model, we use a frame selection scheme to discard nonstationary features. Third, using coherence maximization, we develop a new sorting method to solve the permutation problem in the frequency domain. We test our algorithm on a multimodal speech database composed of different combinations of vowels and consonants. The experimental results show that the proposed algorithm outperforms traditional audio-only BSS, which confirms the benefit of using visual speech to assist audio separation.
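The sorting idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the record gives no detail on the AEM training, the frame selection, or the actual audio and visual features. The code below assumes a hypothetical, already-trained diagonal-covariance GMM over joint audio-visual feature vectors, and hypothetical helper names (`gmm_log_likelihood`, `best_permutation`). For one frequency bin, it tries every assignment of separated outputs to speakers and keeps the one with the highest joint likelihood, which is one plausible reading of "coherence maximization".

```python
import itertools
import numpy as np


def gmm_log_likelihood(features, weights, means, variances):
    """Total log-likelihood of feature rows under a diagonal-covariance GMM.

    features: (T, D); weights: (K,); means and variances: (K, D).
    """
    diff = features[:, None, :] - means[None, :, :]            # (T, K, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )                                                           # (T, K)
    log_mix = np.log(weights) + log_comp
    # Log-sum-exp over components for numerical stability.
    peak = log_mix.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.sum(np.exp(log_mix - peak), axis=1))
    return float(np.sum(per_frame))


def best_permutation(audio_by_output, visual_by_speaker, gmm):
    """Return perm where perm[k] is the separated output assigned to speaker k.

    Every permutation is scored by the joint audio-visual likelihood
    (a stand-in for the coherence measure); the highest score wins.
    """
    n = len(audio_by_output)
    best, best_score = None, -np.inf
    for perm in itertools.permutations(range(n)):
        score = sum(
            gmm_log_likelihood(
                np.hstack([audio_by_output[perm[k]], visual_by_speaker[k]]), *gmm
            )
            for k in range(n)
        )
        if score > best_score:
            best, best_score = perm, score
    return best


# Toy data (assumed, not from the paper): audio and visual features that
# co-vary, with the two separated outputs swapped relative to the speakers.
gmm = (np.array([0.5, 0.5]),
       np.array([[0.0, 0.0], [5.0, 5.0]]),
       np.ones((2, 2)))
audio = [np.full((4, 1), 5.0), np.zeros((4, 1))]
visual = [np.zeros((4, 1)), np.full((4, 1), 5.0)]
print(best_permutation(audio, visual, gmm))  # (1, 0)
```

Exhaustive search over permutations is only practical for a small number of sources; it is used here to keep the sketch short, not because the source describes it.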
Author Liu, Qingju
Jackson, Philip
Wang, Wenwu
Author_xml – sequence: 1
  givenname: Qingju
  surname: Liu
  fullname: Liu, Qingju
  email: Q.Liu@surrey.ac.uk
– sequence: 2
  givenname: Wenwu
  surname: Wang
  fullname: Wang, Wenwu
  email: W.Wang@surrey.ac.uk
– sequence: 3
  givenname: Philip
  surname: Jackson
  fullname: Jackson, Philip
  email: P.Jackson@surrey.ac.uk
CitedBy_id crossref_primary_10_1109_TASLP_2019_2928140
crossref_primary_10_3389_frobt_2020_00085
crossref_primary_10_1049_iet_spr_2018_5132
crossref_primary_10_1007_s11042_014_2199_4
crossref_primary_10_1109_TSP_2013_2277834
crossref_primary_10_1109_LSP_2018_2853566
crossref_primary_10_1109_MSP_2013_2296173
crossref_primary_10_1186_s13636_019_0164_x
crossref_primary_10_1016_j_dsp_2014_04_009
Cites_doi 10.1109/TASL.2006.872623
10.1109/TNN.2009.2032182
10.1016/0165-1684(94)90029-9
10.1109/JSTSP.2010.2057198
10.1109/TASL.2008.2005349
10.1121/1.3050257
10.1109/TSA.2005.851925
10.1109/TMM.2010.2050650
10.1109/78.554307
10.1109/TSA.2004.832994
10.1109/78.258082
10.1162/neco.1997.9.7.1483
10.1109/72.701177
10.1049/ip-f-2.1993.0054
10.1155/S1110865702207015
10.1016/0165-1684(91)90079-X
10.1162/neco.1995.7.6.1129
10.1121/1.1358887
10.1016/j.cognition.2004.01.006
10.1109/LSP.2005.863638
10.1109/TASL.2006.872619
ContentType Journal Article
Copyright 2011 Elsevier B.V.
Discipline Engineering
EISSN 1872-7557
EndPage 1927
ExternalDocumentID 10.1016/j.sigpro.2011.11.007
10_1016_j_sigpro_2011_11_007
S0165168411003926
GroupedDBID --K
--M
-~X
.DC
.~1
0R~
123
1B1
1~.
1~5
4.4
457
4G.
53G
5VS
7-5
71M
8P~
9JN
AABNK
AACTN
AAEDT
AAEDW
AAIAV
AAIKJ
AAKOC
AALRI
AAOAW
AAQFI
AAQXK
AAXUO
AAYFN
ABBOA
ABFNM
ABFRF
ABMAC
ABXDB
ABYKQ
ACDAQ
ACGFO
ACGFS
ACNNM
ACRLP
ACZNC
ADBBV
ADEZE
ADJOM
ADMUD
ADTZH
AEBSH
AECPX
AEFWE
AEKER
AENEX
AFKWA
AFTJW
AGHFR
AGUBO
AGYEJ
AHHHB
AHJVU
AHZHX
AIALX
AIEXJ
AIKHN
AITUG
AJBFU
AJOXV
ALMA_UNASSIGNED_HOLDINGS
AMFUW
AMRAJ
AOUOD
ASPBG
AVWKF
AXJTR
AZFZN
BJAXD
BKOJK
BLXMC
CS3
DU5
EBS
EFJIC
EFLBG
EJD
EO8
EO9
EP2
EP3
F0J
F5P
FDB
FEDTE
FGOYB
FIRID
FNPLU
FYGXN
G-Q
G8K
GBLVA
GBOLZ
HLZ
HVGLF
HZ~
IHE
J1W
JJJVA
KOM
LG9
M41
MO0
N9A
O-L
O9-
OAUVE
OZT
P-8
P-9
P2P
PC.
Q38
R2-
RIG
ROL
RPZ
SBC
SDF
SDG
SDP
SES
SEW
SPC
SPCBC
SST
SSV
SSZ
T5K
TAE
TN5
WUQ
XPP
ZMT
~02
~G-
AATTM
AAXKI
AAYWO
AAYXX
ABDPE
ABJNI
ABWVN
ACLOT
ACRPL
ACVFH
ADCNI
ADNMO
AEIPS
AEUPX
AFJKZ
AFPUW
AGQPQ
AIGII
AIIUN
AKBMS
AKRWK
AKYEP
ANKPU
APXCP
CITATION
EFKBS
~HD
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
ADTOC
AGCQF
UNPAY
ID FETCH-LOGICAL-c385t-4869f59f2b84721fb8df979d24f174c229989716c8ba55db26a68f2f6e90cd103
IEDL.DBID .~1
ISSN 0165-1684
1872-7557
IngestDate Tue Aug 19 17:34:04 EDT 2025
Sat Sep 27 22:44:10 EDT 2025
Wed Oct 01 02:46:25 EDT 2025
Thu Apr 24 23:01:42 EDT 2025
Fri Feb 23 02:28:10 EST 2024
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 8
Keywords Gaussian mixture model (GMM)
Audio–visual coherence
Adapted expectation maximization (AEM)
Indeterminacy
Feature selection and fusion
Convolutive blind source separation (BSS)
Language English
License https://www.elsevier.com/tdm/userlicense/1.0
OpenAccessLink https://proxy.k.utb.cz/login?url=https://www.sciencedirect.com/science/article/pii/S0165168411003926
PQID 1671387845
PQPubID 23500
PageCount 12
PublicationCentury 2000
PublicationDate August 2012
2012-8-00
20120801
PublicationDateYYYYMMDD 2012-08-01
PublicationDate_xml – month: 08
  year: 2012
  text: August 2012
PublicationDecade 2010
PublicationTitle Signal processing
PublicationYear 2012
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V
StartPage 1916
SubjectTerms Adapted expectation maximization (AEM)
Algorithms
Audio–visual coherence
Coherence
Convolutive blind source separation (BSS)
Feature selection and fusion
Gaussian mixture model (GMM)
Indeterminacy
Maximization
Permutations
Separation
Speech
Visual
Title Use of bimodal coherence to resolve the permutation problem in convolutive BSS
URI https://dx.doi.org/10.1016/j.sigpro.2011.11.007
https://www.proquest.com/docview/1671387845
https://www.sciencedirect.com/science/article/pii/S0165168411003926
UnpaywallVersion publishedVersion
Volume 92