Use of bimodal coherence to resolve the permutation problem in convolutive BSS
| Published in | Signal processing Vol. 92; no. 8; pp. 1916 - 1927 |
|---|---|
| Main Authors | Liu, Qingju; Wang, Wenwu; Jackson, Philip |
| Format | Journal Article |
| Language | English |
| Published | Elsevier B.V, 01.08.2012 |
| ISSN | 0165-1684 (print), 1872-7557 (electronic) |
| DOI | 10.1016/j.sigpro.2011.11.007 |
| Abstract | Recent studies show that facial information contained in visual speech can be helpful for the performance enhancement of audio-only blind source separation (BSS) algorithms. Such information is exploited through the statistical characterization of the coherence between the audio and visual speech using, e.g., a Gaussian mixture model (GMM). In this paper, we present three contributions. With the synchronized features, we propose an adapted expectation maximization (AEM) algorithm to model the audio–visual coherence in the off-line training process. To improve the accuracy of this coherence model, we use a frame selection scheme to discard nonstationary features. Then with the coherence maximization technique, we develop a new sorting method to solve the permutation problem in the frequency domain. We test our algorithm on a multimodal speech database composed of different combinations of vowels and consonants. The experimental results show that our proposed algorithm outperforms traditional audio-only BSS, which confirms the benefit of using visual speech to assist in separation of the audio. |
| Author | Liu, Qingju; Jackson, Philip; Wang, Wenwu |
| Author_xml | 1. Liu, Qingju (Q.Liu@surrey.ac.uk); 2. Wang, Wenwu (W.Wang@surrey.ac.uk); 3. Jackson, Philip (P.Jackson@surrey.ac.uk) |
| ContentType | Journal Article |
| Copyright | 2011 Elsevier B.V. |
| Discipline | Engineering |
| EISSN | 1872-7557 |
| EndPage | 1927 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 8 |
| Keywords | Gaussian mixture model (GMM); Audio–visual coherence; Adapted expectation maximization (AEM); Indeterminacy; Feature selection and fusion; Convolutive blind source separation (BSS) |
| Language | English |
| License | https://www.elsevier.com/tdm/userlicense/1.0 |
| OpenAccessLink | https://proxy.k.utb.cz/login?url=https://www.sciencedirect.com/science/article/pii/S0165168411003926 |
| PageCount | 12 |
| PublicationCentury | 2000 |
| PublicationDate | August 2012 |
| PublicationDateYYYYMMDD | 2012-08-01 |
| PublicationDecade | 2010 |
| PublicationTitle | Signal processing |
| PublicationYear | 2012 |
| Publisher | Elsevier B.V |
| StartPage | 1916 |
| SubjectTerms | Adapted expectation maximization (AEM); Algorithms; Audio–visual coherence; Coherence; Convolutive blind source separation (BSS); Feature selection and fusion; Gaussian mixture model (GMM); Indeterminacy; Maximization; Permutations; Separation; Speech; Visual |
| Title | Use of bimodal coherence to resolve the permutation problem in convolutive BSS |
| URI | https://dx.doi.org/10.1016/j.sigpro.2011.11.007 https://www.proquest.com/docview/1671387845 https://www.sciencedirect.com/science/article/pii/S0165168411003926 |
| UnpaywallVersion | publishedVersion |
| Volume | 92 |
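
The abstract describes resolving the frequency-domain permutation problem by picking, in each frequency bin, the source ordering that maximizes a coherence measure. The sketch below illustrates only that general idea, and it rests on several assumptions: the function name `align_permutations`, the array layout, and the use of plain correlation against a per-source reference activity are hypothetical stand-ins. It is not the paper's GMM-based audio–visual coherence model, its AEM training, or its frame selection scheme.

```python
from itertools import permutations

import numpy as np


def align_permutations(env, ref):
    """Reorder separated outputs in every frequency bin (illustrative sketch).

    env: array (n_bins, n_src, n_frames), magnitude envelopes of the
         separated outputs in each frequency bin (assumed layout).
    ref: array (n_src, n_frames), one reference activity per source,
         a hypothetical stand-in for a visually derived cue.
    Returns (aligned, perms); perms[f] is the ordering chosen for bin f.
    """
    n_bins, n_src, _ = env.shape
    aligned = np.empty_like(env)
    perms = []
    for f in range(n_bins):
        best_perm, best_score = None, -np.inf
        # Exhaustive search over orderings; feasible for a few sources only.
        for perm in permutations(range(n_src)):
            # Assumption: correlation replaces the GMM coherence score.
            score = sum(
                np.corrcoef(env[f, p], ref[i])[0, 1]
                for i, p in enumerate(perm)
            )
            if score > best_score:
                best_perm, best_score = perm, score
        aligned[f] = env[f, list(best_perm)]
        perms.append(best_perm)
    return aligned, perms
```

The exhaustive search costs factorial time in the number of sources, which is acceptable for the two- or three-speaker case but would need a cheaper assignment step for larger mixtures.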