Speech feature extraction method representing periodicity and aperiodicity in sub bands for robust speech recognition

Bibliographic Details
Published in	2004 IEEE International Conference on Acoustics, Speech and Signal Processing Vol. 1; pp. I - 141
Main Authors	Ishizuka, K., Miyazaki, N.
Format	Conference Proceeding
Language	English Japanese
Published	Piscataway, N.J.: IEEE, 28.09.2004
Subjects	1/f noise; Acoustic signal; Applied sciences; Audio signal; Automatic recognition; Automatic speech recognition; Cepstral analysis; Comb filters; Database; Detection, estimation, filtering, equalization, prediction; Error rate; Exact sciences and technology; Feature extraction; Filter bank; Filtering; Harmonic; Information, signal and communications theory; Japanese; Output signal; Power harmonic filters; Robustness; Signal and communications theory; Signal design; Signal processing; Signal, noise; Spatial databases; Speech analysis; Speech processing; Speech recognition; Telecommunications and information theory; Verbal perception; Vowel
ISBN	9780780384842 0780384849
ISSN	1520-6149
DOI	10.1109/ICASSP.2004.1325942

Abstract	This paper proposes a feature extraction method that represents both the periodicity and the aperiodicity of speech for robust speech recognition. The method was motivated by findings in speech perception research. The speech signal is first passed through a Gammatone filter bank, and the output of each filter is then comb filtered. A comb filter designed individually for each Gammatone filter output divides that output into periodic and aperiodic features within the sub-band: the power suppressed by comb filtering is taken as the periodic feature, and the power of the residue after comb filtering is taken as the aperiodic feature. Both features are used as feature parameters for automatic speech recognition. In a preliminary five-vowel recognition experiment against the conventional MFCC-based feature extraction method, the proposed method improves vowel recognition rates by as much as 14.7% in the presence of pink noise or a harmonic complex tone interferer. In an evaluation on the Aurora-2J database (a Japanese noisy digit recognition database), the proposed method reduces the word error rate relative to the MFCC-based baseline by as much as 59.62%, with an average reduction of 18.21%.
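The periodic/aperiodic split described in the abstract can be illustrated with a minimal sketch for a single sub-band. This is not the authors' implementation: the Gammatone stage is omitted, the comb filter is assumed to be a simple feed-forward notch comb with an integer period in samples, and the function name `comb_split` is hypothetical.

```python
import math


def comb_split(subband, period):
    """Split the power of one sub-band signal into (periodic, aperiodic).

    Assumptions (not taken from the paper): the comb filter is
    y[n] = (x[n] - x[n - period]) / 2, and `period` is a known
    integer pitch period in samples.
    """
    if period <= 0 or period >= len(subband):
        raise ValueError("period must satisfy 0 < period < len(subband)")

    # Only samples with a valid delayed partner are used.
    segment = subband[period:]
    residue = [
        (subband[n] - subband[n - period]) / 2.0
        for n in range(period, len(subband))
    ]

    total_power = sum(v * v for v in segment) / len(segment)
    # Power of the residue after comb filtering: the aperiodic feature.
    aperiodic_power = sum(v * v for v in residue) / len(residue)
    # Power suppressed by the comb filter: the periodic feature.
    periodic_power = max(total_power - aperiodic_power, 0.0)
    return periodic_power, aperiodic_power


# A sinusoid whose period matches the comb delay is cancelled by the filter,
# so almost all of its power is attributed to the periodic feature.
period = 20
harmonic = [math.sin(2 * math.pi * n / period) for n in range(220)]
periodic, aperiodic = comb_split(harmonic, period)
```

In a full front end, one such pair would be computed per Gammatone channel and per analysis frame, with the period taken from a pitch estimate; how the paper designs the individual comb filters is not specified in this record.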
Author affiliations	Ishizuka, Kentaro (NTT Commun. Sci. Labs., NTT Corp., Tokyo, Japan); Miyazaki, Noboru (NTT Commun. Sci. Labs., NTT Corp., Tokyo, Japan)
Copyright	2006 INIST-CNRS
Discipline	Engineering Applied Sciences
IsPeerReviewed	false
IsScholarly	true
License	CC BY 4.0
MeetingName	2004 IEEE International Conference on Acoustics, Speech, and Signal Processing (proceedings)
URI	https://ieeexplore.ieee.org/document/1325942