Spoofing speech detection using temporal convolutional neural network
| Published in | 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), pp. 1-6 |
|---|---|
| Main Authors | Xiaohai Tian, Xiong Xiao, Eng Siong Chng, Haizhou Li |
| Format | Conference Proceeding |
| Language | English |
| Published | Asia Pacific Signal and Information Processing Association, 01.12.2016 |
| Subjects | Convolution; Feature extraction; Natural languages; Neural networks; Speech; Trajectory; Vocoders |
| DOI | 10.1109/APSIPA.2016.7820738 |
| Abstract | Spoofing speech detection aims to differentiate spoofing speech from natural speech. Frame-based features are used in most previous work. Although multiple frames or dynamic features are used to form a super-vector that represents temporal information, the time span covered by these features is not sufficient. Most systems failed to detect non-vocoder or unit-selection-based spoofing attacks. In this work, we propose a temporal convolutional neural network (CNN) based classifier for spoofing speech detection. The temporal CNN first convolves the feature trajectories with a set of filters, then extracts the maximum responses of these filters within a time window using a max-pooling layer. Because of the max-pooling, useful information can be extracted from a long temporal span without concatenating a large number of neighbouring frames, as is done in a feedforward deep neural network (DNN). Five types of features are employed to assess the performance of the proposed classifier. Experimental results on the ASVspoof 2015 corpus show that the temporal CNN based classifier is effective for synthetic speech detection. Specifically, the proposed method brings a significant performance boost for unit-selection-based spoofing speech detection. |
|---|---|
| Author | Xiong Xiao; Haizhou Li; Eng Siong Chng; Xiaohai Tian |
| Author_xml | 1. Xiaohai Tian (xhtian@ntu.edu.sg), Sch. of Comput. Sci. & Eng., Nanyang Technol. Univ., Singapore, Singapore; 2. Xiong Xiao (xiaoxiong@ntu.edu.sg), Temasek Labs., NTU, Singapore, Singapore; 3. Eng Siong Chng (ASESChng@ntu.edu.sg), Sch. of Comput. Sci. & Eng., Nanyang Technol. Univ., Singapore, Singapore; 4. Haizhou Li (eleliha@nus.edu.sg), Sch. of Comput. Sci. & Eng., Nanyang Technol. Univ., Singapore, Singapore |
| EISBN | 9881476828 9789881476821 |
| EndPage | 6 |
| ExternalDocumentID | oai:dr.ntu.edu.sg:10356/89639 7820738 |
| Genre | orig-research |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | false |
| OpenAccessLink | https://proxy.k.utb.cz/login?url=https://dr.ntu.edu.sg/bitstream/10356/89639/1/APSIPA_CNN_ASV.pdf |
| PageCount | 6 |
| PublicationCentury | 2000 |
| PublicationDate | 2016-Dec. |
| PublicationDateYYYYMMDD | 2016-12-01 |
| PublicationDate_xml | – month: 12 year: 2016 text: 2016-Dec. |
| PublicationDecade | 2010 |
| PublicationTitle | 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) |
| PublicationTitleAbbrev | APSIPA |
| PublicationYear | 2016 |
| Publisher | Asia Pacific Signal and Information Processing Association |
| StartPage | 1 |
| SubjectTerms | Convolution; Feature extraction; Natural languages; Neural networks; Speech; Trajectory; Vocoders |
| Title | Spoofing speech detection using temporal convolutional neural network |
| URI | https://ieeexplore.ieee.org/document/7820738 https://dr.ntu.edu.sg/bitstream/10356/89639/1/APSIPA_CNN_ASV.pdf |
| UnpaywallVersion | submittedVersion |
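
The abstract in this record describes the core operation of the temporal CNN: feature trajectories are convolved along time with a set of filters, and a max-pooling layer keeps the strongest response of each filter within a time window. The following is a minimal NumPy sketch of that idea only, not the authors' implementation. The function name `temporal_conv_maxpool`, the array shapes, the use of non-overlapping pooling windows, and the absence of bias terms and nonlinearities are all assumptions made for illustration; the record does not specify them.

```python
import numpy as np


def temporal_conv_maxpool(features, filters, pool_size):
    """Illustrative temporal convolution followed by max-pooling over time.

    features : array of shape (T, D), T frames of D-dimensional features
    filters  : array of shape (K, W, D), K filters spanning W frames each
    pool_size: number of consecutive filter responses pooled together
               (non-overlapping windows; an assumption of this sketch)
    Returns an array of shape (n_windows, K).
    """
    T, D = features.shape
    K, W, Df = filters.shape
    if D != Df:
        raise ValueError("feature and filter dimensions must match")
    n_pos = T - W + 1
    if n_pos < pool_size:
        raise ValueError("not enough frames for one pooling window")

    # Slide each filter along the time axis (valid positions only).
    responses = np.empty((n_pos, K))
    for t in range(n_pos):
        patch = features[t:t + W]  # W consecutive frames, shape (W, D)
        responses[t] = np.tensordot(filters, patch, axes=([1, 2], [0, 1]))

    # Keep the maximum response of every filter in each time window.
    n_win = n_pos // pool_size
    trimmed = responses[:n_win * pool_size]
    return trimmed.reshape(n_win, pool_size, K).max(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((100, 20))    # 100 frames, 20-dim features
    filts = rng.standard_normal((8, 5, 20))   # 8 filters, each 5 frames wide
    pooled = temporal_conv_maxpool(feats, filts, pool_size=16)
    print(pooled.shape)
```

In this sketch each pooled value summarises `pool_size + W - 1` input frames, which illustrates how max-pooling lets a classifier draw on a long temporal span without concatenating that many neighbouring frames into a single input vector.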