Speech feature extraction method representing periodicity and aperiodicity in sub bands for robust speech recognition
| Published in | 2004 IEEE International Conference on Acoustics, Speech and Signal Processing Vol. 1; pp. I-141 |
|---|---|
| Main Authors | Ishizuka, K.; Miyazaki, N. |
| Format | Conference Proceeding |
| Language | English; Japanese |
| Published | Piscataway, N.J, IEEE, 28.09.2004 |
| ISBN | 9780780384842; 0780384849 |
| ISSN | 1520-6149 |
| DOI | 10.1109/ICASSP.2004.1325942 |

| Abstract | This paper proposes a feature extraction method that represents both the periodicity and aperiodicity of speech for robust speech recognition. The development of this feature extraction method was motivated by findings in speech perception research. With this method, the speech sound is filtered by Gammatone filter banks, and then the output of each filter is comb filtered. Individual comb filters designed for each output signal of the Gammatone filter are used to divide the output of each filter into its periodic and aperiodic features in the sub band. The power suppressed by comb filtering is considered to be a periodic feature, whereas the power of the residue after comb filtering is considered to be an aperiodic feature. This method uses both features as the feature parameters for automatic speech recognition. A preliminary experiment using a five vowel recognition task designed to compare the proposed approach with the conventional MFCC-based feature extraction method shows that the proposed method improves vowel recognition rates by as much as 14.7 % in the presence of pink noise or a harmonic complex tone interferer. An evaluation experiment undertaken using the Aurora-2J database (Japanese noisy digit recognition database) to compare the proposed approach with the MFCC-based conventional (baseline) feature extraction method shows that the proposed method reduces the word error rate by as much as 59.62 %, with an average value of 18.21 %. | 
    
|---|---|
| Author | Ishizuka, K.; Miyazaki, N. |
    
| Author_xml | 1. Ishizuka, K. (NTT Commun. Sci. Labs., NTT Corp., Tokyo, Japan); 2. Miyazaki, N. (NTT Commun. Sci. Labs., NTT Corp., Tokyo, Japan) |
    
| BackLink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=17565876 (record in Pascal Francis) |
    
| ContentType | Conference Proceeding | 
    
| Copyright | 2006 INIST-CNRS | 
    
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP) 1998-present; Pascal-Francis |
    
| Discipline | Engineering; Applied Sciences |
    
| EndPage | 141 | 
    
| ExternalDocumentID | 17565876 1325942  | 
    
| Genre | orig-research | 
    
| IsPeerReviewed | false | 
    
| IsScholarly | true | 
    
| Keywords | Harmonic; Speech analysis; Filtering; Acoustic signal; Error rate; Comb filters; Verbal perception; Output signal; Japanese; Cepstral analysis; Audio signal; Vowel; Speech recognition; Database; Signal processing; Feature extraction; Filter bank; Automatic recognition; Speech processing |
    
| License | CC BY 4.0 | 
    
| MeetingName | 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing (proceedings) | 
    
| PublicationCentury | 2000 | 
    
| PublicationDate | 2004-09-28 | 
    
| PublicationPlace | Piscataway, N.J | 
    
| PublicationTitle | 2004 IEEE International Conference on Acoustics, Speech and Signal Processing | 
    
| PublicationTitleAbbrev | ICASSP | 
    
| PublicationYear | 2004 | 
    
| Publisher | IEEE | 
    
| StartPage | I | 
    
| SubjectTerms | 1f noise; Applied sciences; Automatic speech recognition; Detection, estimation, filtering, equalization, prediction; Exact sciences and technology; Feature extraction; Filter bank; Filtering; Information, signal and communications theory; Power harmonic filters; Robustness; Signal and communications theory; Signal design; Signal processing; Signal, noise; Spatial databases; Speech processing; Speech recognition; Telecommunications and information theory |
    
| Title | Speech feature extraction method representing periodicity and aperiodicity in sub bands for robust speech recognition | 
    
| URI | https://ieeexplore.ieee.org/document/1325942 | 
    
| Volume | 1 | 
    
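
The abstract describes splitting each Gammatone sub-band output into a periodic part (the power suppressed by a comb filter) and an aperiodic part (the power of the comb-filter residue). The following is only a minimal sketch of that idea, not the authors' implementation. It assumes a known fundamental period, a truncated FIR approximation of a fourth-order gammatone impulse response with the Glasberg and Moore ERB bandwidth, and a simple feedforward comb filter `0.5 * (x[n] - x[n - period])`. The record does not give the paper's actual comb filter design, and every function name and parameter value below is hypothetical.

```python
import math
import random


def gammatone_ir(fc, fs, duration=0.03, order=4):
    """Truncated impulse response of one gammatone filter centred at fc (Hz)."""
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)  # ERB approximation (assumed here)
    b = 1.019 * erb                          # commonly used bandwidth scaling
    return [
        (k / fs) ** (order - 1)
        * math.exp(-2.0 * math.pi * b * k / fs)
        * math.cos(2.0 * math.pi * fc * k / fs)
        for k in range(int(duration * fs))
    ]


def filter_valid(x, h):
    """FIR-filter x with h, keeping only samples where h fully overlaps x."""
    m = len(h)
    return [sum(h[k] * x[n - k] for k in range(m)) for n in range(m - 1, len(x))]


def periodic_aperiodic_power(band, period):
    """Split sub-band power with the comb filter 0.5 * (x[n] - x[n - period])."""
    count = len(band) - period
    total = sum(band[n] ** 2 for n in range(period, len(band))) / count
    residue = sum(
        (0.5 * (band[n] - band[n - period])) ** 2
        for n in range(period, len(band))
    ) / count
    periodic = max(total - residue, 0.0)  # power suppressed by the comb filter
    return periodic, residue              # residue power is the aperiodic part


def periodic_fraction(signal, fc, fs, period):
    """Share of the sub-band power that the comb filter classifies as periodic."""
    band = filter_valid(signal, gammatone_ir(fc, fs))
    periodic, aperiodic = periodic_aperiodic_power(band, period)
    return periodic / (periodic + aperiodic)


# Hypothetical test signals: a 100 Hz harmonic complex and white Gaussian noise.
FS = 8000                 # sampling rate in Hz (assumption)
F0 = 100                  # fundamental frequency in Hz (assumed known)
PERIOD = FS // F0         # fundamental period in samples
N = int(0.3 * FS)

harmonic = [
    sum(math.sin(2.0 * math.pi * F0 * h * n / FS) / h for h in range(1, 11))
    for n in range(N)
]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(N)]

harmonic_fraction = periodic_fraction(harmonic, 500.0, FS, PERIOD)
noise_fraction = periodic_fraction(noise, 500.0, FS, PERIOD)
print(f"periodic share, harmonic complex: {harmonic_fraction:.3f}")
print(f"periodic share, white noise:      {noise_fraction:.3f}")
```

In this sketch the harmonic complex leaves almost no residue in the 500 Hz band, so nearly all of its power is counted as periodic, while band-limited noise keeps a substantial residue. A recognizer along the lines of the abstract would compute both powers per sub-band and per frame and use them together as feature parameters; the frame segmentation, period estimation, and any later feature transforms are outside what this record specifies.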