Robust speaker recognition based on improved GFCC

Bibliographic Details
Published in 2016 2nd IEEE International Conference on Computer and Communications (ICCC) pp. 1927 - 1931
Main Authors Xiaoyuan Shi, Haiyan Yang, Ping Zhou
Format Conference Proceeding
Language English
Published IEEE 01.10.2016
DOI 10.1109/CompComm.2016.7925037

Abstract To address the sharp loss of robustness that traditional Mel Frequency Cepstral Coefficient (MFCC) features suffer in speaker recognition systems, an algorithm based on improved Gammatone Frequency Cepstral Coefficients (GFCC) is proposed. GFCC differs from traditional MFCC in that it replaces the Mel filter bank with a Gammatone filter bank to improve robustness. On this basis, the paper proposes using multitaper estimation, MVA (mean subtraction, variance normalization, and autoregressive moving average filtering), and other techniques to further enhance robustness, and tests the method on the TIMIT speech database. The experimental results show that, across different noise types and signal-to-noise ratios (SNRs), the proposed improved GFCC achieves the lowest equal error rate and the best robustness, with a larger advantage over the other algorithms when the SNR is below 10 dB.
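
The MVA step named in the abstract can be illustrated with a minimal sketch. The record does not give the paper's exact formulation, so the code below assumes the form commonly used in the MVA literature: per-dimension mean subtraction and variance normalization over an utterance, followed by an ARMA smoother that averages the previous M smoothed values with the current and next M normalized values. The function name `mva`, the parameter `order` (M), and the choice to leave the first and last M frames unsmoothed are illustrative assumptions, not details taken from the paper.

```python
import math


def mva(frames, order=2):
    """Apply mean subtraction, variance normalization and ARMA smoothing.

    frames: list of T feature vectors (each a list of D floats), e.g. one
            cepstral vector per frame of an utterance.
    order:  ARMA order M (assumed default; not taken from the paper).
    Returns a new list of T vectors with the same shape.
    """
    num_frames = len(frames)
    num_dims = len(frames[0])
    out = [[0.0] * num_dims for _ in range(num_frames)]

    for d in range(num_dims):
        track = [frames[t][d] for t in range(num_frames)]

        # Mean subtraction and variance normalization over the utterance.
        mean = sum(track) / num_frames
        var = sum((v - mean) ** 2 for v in track) / num_frames
        std = math.sqrt(var) or 1.0  # guard against constant dimensions
        norm = [(v - mean) / std for v in track]

        # ARMA smoothing: previous M outputs plus current and next M inputs.
        smooth = list(norm)  # edge frames keep their normalized values
        for t in range(order, num_frames - order):
            past = sum(smooth[t - order:t])
            future = sum(norm[t:t + order + 1])
            smooth[t] = (past + future) / (2 * order + 1)

        for t in range(num_frames):
            out[t][d] = smooth[t]

    return out
```

With `order=0` the smoother is a no-op and the function reduces to plain mean and variance normalization, which makes the effect of the ARMA stage easy to inspect in isolation.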
Author affiliations
Xiaoyuan Shi (yhy@guet.edu.cn), Coll. of Inf. & Commun., Guilin Univ. of Electron. Technol., Guilin, China
Haiyan Yang (yhy@guet.edu.cn), Coll. of Inf. & Commun., Guilin Univ. of Electron. Technol., Guilin, China
Ping Zhou (316465910@qq.com), Coll. of Inf. & Commun., Guilin Univ. of Electron. Technol., Guilin, China
EISBN 9781467390262 (1467390267); 9781467390255 (1467390259)
Subjects Filter banks; Filtering algorithms; GFCC; Mel frequency cepstral coefficient; Multitaper estimation; MVA; Robustness; Speaker recognition
URI https://ieeexplore.ieee.org/document/7925037