Using deformable templates to infer visual speech dynamics
| Published in | Proceedings of 1994 28th Asilomar Conference on Signals, Systems and Computers, Vol. 1, pp. 578-582 |
|---|---|
| Main Authors | Hennecke, M.E.; Prasad, K.V.; Stork, D.G. |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE Comput. Soc. Press, 1994 |
| ISBN | 0818664053, 9780818664052 |
| ISSN | 1058-6393 |
| DOI | 10.1109/ACSSC.1994.471518 |
| Abstract | The visual image of a talker provides information complementary to the acoustic speech waveform, and enables improved recognition accuracy, especially in environments corrupted by high acoustic noise or multiple talkers. Because most of the phonologically relevant visual information is from the mouth and lips, it is important to infer accurately and robustly their dynamics; moreover it is desirable to extract this information without the use of invasive markers or patterned illumination. We describe the use of deformable templates for speechreading, in order to infer the dynamics of lip contours throughout an image sequence. Template computations can be done relatively quickly and the resulting small number of shape description parameters are quite robust to visual noise and variations in illumination. Such templates delineate the inside of the mouth, so that the teeth and the tongue can also be found. |
| Author | Hennecke, M.E. (Dept. of Electr. Eng., Stanford Univ., CA, USA); Prasad, K.V. (Dept. of Electr. Eng., Stanford Univ., CA, USA); Stork, D.G. |
| SubjectTerms | Acoustic noise; Acoustic waves; Data mining; Image recognition; Lighting; Lips; Mouth; Noise robustness; Speech enhancement; Speech recognition |
| URI | https://ieeexplore.ieee.org/document/471518 | 
    