Estimation of Glottal Closing and Opening Instants in Voiced Speech Using the YAGA Algorithm
| Published in | IEEE transactions on audio, speech, and language processing Vol. 20; no. 1; pp. 82 - 91 |
|---|---|
| Main Authors | Thomas, M. R. P., Gudnason, J., Naylor, P. A. |
| Format | Journal Article |
| Language | English |
| Published | Piscataway, NJ: IEEE (Institute of Electrical and Electronics Engineers), 01.01.2012 |
| ISSN | 1558-7916 1558-7924 |
| DOI | 10.1109/TASL.2011.2157684 |
| Abstract | Accurate estimation of glottal closing instants (GCIs) and opening instants (GOIs) is important for speech processing applications that benefit from glottal-synchronous processing, including pitch tracking, prosodic speech modification, speech dereverberation, synthesis and study of pathological voice. We propose the Yet Another GCI/GOI Algorithm (YAGA) to detect GCIs from speech signals by employing multiscale analysis, the group delay function, and N-best dynamic programming. A novel GOI detector based upon the consistency of the candidates' closed quotients relative to the estimated GCIs is also presented. Particular attention is paid to the precise definition of the glottal closed phase, which we define as the analysis interval that produces minimum deviation from an all-pole model of the speech signal with closed-phase linear prediction (LP). A reference algorithm analyzing both electroglottograph (EGG) and speech signals is described for evaluation of the proposed speech-based algorithm. In addition to the development of a GCI/GOI detector, an important outcome of this work is in demonstrating that GOIs derived from the EGG signal are not necessarily well-suited to closed-phase LP analysis. Evaluation of YAGA against the APLAWD and SAM databases shows that GCI identification rates of up to 99.3% can be achieved with an accuracy of 0.3 ms and GOI detection can be achieved equally reliably with an accuracy of 0.5 ms. |
| Author | Gudnason, J. Thomas, M. R. P. Naylor, P. A. |
| Author_xml | 1. Thomas, M. R. P. (mrt102@imperial.ac.uk), Electr. & Electron. Eng. Dept., Imperial Coll., London, UK; 2. Gudnason, J. (jg@ru.is), Sch. of Sci. & Eng., Reykjavik Univ., Reykjavik, Iceland; 3. Naylor, P. A. (p.naylor@imperial.ac.uk), Electr. & Electron. Eng. Dept., Imperial Coll., London, UK |
| BackLink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=25473435 (View record in Pascal Francis) |
| CODEN | ITASD8 |
| ContentType | Journal Article |
| Copyright | 2015 INIST-CNRS |
| Discipline | Engineering Applied Sciences |
| EISSN | 1558-7924 |
| EndPage | 91 |
| Genre | orig-research |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Keywords | glottal opening instants (GOIs); multiscale analysis; Acoustic signal; Speech synthesis; glottal closing instants (GCIs); Accuracy; Prosody; Vocal signal; group delay function; speech processing; Dynamic programming; Pitch (acoustics); electroglottograph (EGG); Group delay |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html CC BY 4.0 |
| PageCount | 10 |
| PublicationCentury | 2000 |
| PublicationDate | 2012-Jan. 2012-01-00 2012 |
| PublicationDateYYYYMMDD | 2012-01-01 |
| PublicationDecade | 2010 |
| PublicationPlace | Piscataway, NJ |
| PublicationTitle | IEEE transactions on audio, speech, and language processing |
| PublicationTitleAbbrev | TASL |
| PublicationYear | 2012 |
| Publisher | IEEE Institute of Electrical and Electronics Engineers |
| StartPage | 82 |
| SubjectTerms | Adaptation model; Algorithm design and analysis; Applied sciences; Delay; Dynamic programming; electroglottograph (EGG); Estimation; Exact sciences and technology; glottal closing instants (GCIs); glottal opening instants (GOIs); group delay function; Heuristic algorithms; Information, signal and communications theory; multiscale analysis; Signal processing; Speech; Speech processing; Telecommunications and information theory |
| Title | Estimation of Glottal Closing and Opening Instants in Voiced Speech Using the YAGA Algorithm |
| URI | https://ieeexplore.ieee.org/document/5784321 |
| Volume | 20 |