Estimation of Glottal Closing and Opening Instants in Voiced Speech Using the YAGA Algorithm

Bibliographic Details
Published in IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, no. 1, pp. 82-91
Main Authors Thomas, M. R. P., Gudnason, J., Naylor, P. A.
Format Journal Article
Language English
Published Piscataway, NJ IEEE 01.01.2012
Institute of Electrical and Electronics Engineers
ISSN 1558-7916
EISSN 1558-7924
DOI 10.1109/TASL.2011.2157684


Abstract Accurate estimation of glottal closing instants (GCIs) and opening instants (GOIs) is important for speech processing applications that benefit from glottal-synchronous processing, including pitch tracking, prosodic speech modification, speech dereverberation, synthesis, and the study of pathological voice. We propose the Yet Another GCI/GOI Algorithm (YAGA) to detect GCIs from speech signals by employing multiscale analysis, the group delay function, and N-best dynamic programming. A novel GOI detector based upon the consistency of the candidates' closed quotients relative to the estimated GCIs is also presented. Particular attention is paid to the precise definition of the glottal closed phase, which we define as the analysis interval that produces minimum deviation from an all-pole model of the speech signal with closed-phase linear prediction (LP). A reference algorithm analyzing both electroglottograph (EGG) and speech signals is described for evaluation of the proposed speech-based algorithm. In addition to the development of a GCI/GOI detector, an important outcome of this work is the demonstration that GOIs derived from the EGG signal are not necessarily well suited to closed-phase LP analysis. Evaluation of YAGA against the APLAWD and SAM databases shows that GCI identification rates of up to 99.3% can be achieved with an accuracy of 0.3 ms, and that GOI detection can be achieved equally reliably with an accuracy of 0.5 ms.
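The closed quotient mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common definition (closed-phase duration divided by the glottal cycle period, with the closed phase running from a GCI to the next GOI); the function name `closed_quotients` and its inputs are hypothetical and are not taken from the paper, and this is not the YAGA GOI detector itself.

```python
from typing import List, Sequence


def closed_quotients(gcis: Sequence[float], gois: Sequence[float]) -> List[float]:
    """Closed quotient per glottal cycle (illustrative, assumed definition).

    gcis: sorted glottal closing instants, in seconds.
    gois: sorted glottal opening instants, in seconds.
    For each pair of consecutive GCIs, the first GOI lying strictly between
    them is taken as the end of the closed phase. Cycles that contain no
    such GOI are skipped.
    """
    result: List[float] = []
    j = 0  # index of the next GOI not yet passed
    for start, end in zip(gcis, gcis[1:]):
        # Skip GOIs at or before the start of this cycle.
        while j < len(gois) and gois[j] <= start:
            j += 1
        if j < len(gois) and gois[j] < end:
            # Closed-phase duration divided by cycle period.
            result.append((gois[j] - start) / (end - start))
    return result
```

For example, with GCIs at 0, 10, and 20 ms and GOIs at 4 and 15 ms, the two cycles have closed quotients of roughly 0.4 and 0.5. A detector that checks the consistency of candidates' closed quotients would, under this reading, prefer GOI candidates whose values vary little from one cycle to the next.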
Author Gudnason, J.
Thomas, M. R. P.
Naylor, P. A.
Author_xml – sequence: 1
  givenname: M. R. P.
  surname: Thomas
  fullname: Thomas, M. R. P.
  email: mrt102@imperial.ac.uk
  organization: Electr. & Electron. Eng. Dept., Imperial Coll., London, UK
– sequence: 2
  givenname: J.
  surname: Gudnason
  fullname: Gudnason, J.
  email: jg@ru.is
  organization: Sch. of Sci. & Eng., Reykjavik Univ., Reykjavik, Iceland
– sequence: 3
  givenname: P. A.
  surname: Naylor
  fullname: Naylor, P. A.
  email: p.naylor@imperial.ac.uk
  organization: Electr. & Electron. Eng. Dept., Imperial Coll., London, UK
CODEN ITASD8
ContentType Journal Article
Copyright 2015 INIST-CNRS
Copyright_xml – notice: 2015 INIST-CNRS
Discipline Engineering
Applied Sciences
EISSN 1558-7924
EndPage 91
ExternalDocumentID 25473435
10_1109_TASL_2011_2157684
5784321
Genre orig-research
ISSN 1558-7916
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords glottal opening instants (GOIs)
multiscale analysis
Acoustic signal
Speech synthesis
glottal closing instants (GCIs)
Accuracy
Prosody
Vocal signal
group delay function
speech processing
Dynamic programming
Pitch(acoustics)
electroglottograph (EGG)
Group delay
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
CC BY 4.0
PageCount 10
PublicationCentury 2000
PublicationDate 2012-Jan.
2012-01-00
2012
PublicationDateYYYYMMDD 2012-01-01
PublicationDate_xml – month: 01
  year: 2012
  text: 2012-Jan.
PublicationDecade 2010
PublicationPlace Piscataway, NJ
PublicationPlace_xml – name: Piscataway, NJ
PublicationTitle IEEE transactions on audio, speech, and language processing
PublicationTitleAbbrev TASL
PublicationYear 2012
Publisher IEEE
Institute of Electrical and Electronics Engineers
Publisher_xml – name: IEEE
– name: Institute of Electrical and Electronics Engineers
StartPage 82
SubjectTerms Adaptation model
Algorithm design and analysis
Applied sciences
Delay
Dynamic programming
electroglottograph (EGG)
Estimation
Exact sciences and technology
glottal closing instants (GCIs)
glottal opening instants (GOIs)
group delay function
Heuristic algorithms
Information, signal and communications theory
multiscale analysis
Signal processing
Speech
Speech processing
Telecommunications and information theory
Title Estimation of Glottal Closing and Opening Instants in Voiced Speech Using the YAGA Algorithm
URI https://ieeexplore.ieee.org/document/5784321
Volume 20