Spoofing speech detection using temporal convolutional neural network

Bibliographic Details
Published in: 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), pp. 1-6
Main Authors: Xiaohai Tian, Xiong Xiao, Eng Siong Chng, Haizhou Li
Format: Conference Proceeding
Language: English
Published: Asia Pacific Signal and Information Processing Association, 01.12.2016
DOI: 10.1109/APSIPA.2016.7820738

Abstract Spoofing speech detection aims to differentiate spoofing speech from natural speech. Most previous work uses frame-based features. Although multiple frames or dynamic features are used to form a super-vector that represents temporal information, the time span covered by these features is not sufficient. Most of the systems failed to detect non-vocoder or unit-selection-based spoofing attacks. In this work, we propose a temporal convolutional neural network (CNN) based classifier for spoofing speech detection. The temporal CNN first convolves the feature trajectories with a set of filters, then extracts the maximum responses of these filters within a time window using a max-pooling layer. Because of the max-pooling, useful information can be extracted from a long temporal span without concatenating a large number of neighbouring frames, as is done in a feedforward deep neural network (DNN). Five types of features are employed to assess the performance of the proposed classifier. Experimental results on the ASVspoof 2015 corpus show that the temporal CNN based classifier is effective for synthetic speech detection. Specifically, the proposed method brings a significant performance boost for unit-selection-based spoofing speech detection.
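
The convolve-then-max-pool step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `temporal_conv_maxpool`, the filter shape (each filter spanning all feature dimensions over a few consecutive frames), and the non-overlapping pooling windows are all assumptions made for illustration, since the record does not specify the architecture.

```python
import numpy as np


def temporal_conv_maxpool(features, filters, pool_size):
    """Hypothetical sketch of temporal convolution followed by max-pooling.

    features:  (T, D) array, one D-dimensional feature vector per frame.
    filters:   (K, W, D) array, K filters each spanning W frames (assumed shape).
    pool_size: number of consecutive filter responses pooled together.
    Returns an (n_windows, K) array of maximum filter responses.
    """
    T, D = features.shape
    K, W, D_f = filters.shape
    if D != D_f:
        raise ValueError("filters and features must share the feature dimension")
    n_out = T - W + 1  # number of valid filter positions along time
    if n_out < pool_size:
        raise ValueError("utterance too short for the requested pooling window")

    # Stack every W-frame slice of the trajectories: shape (n_out, W, D).
    windows = np.stack([features[t:t + W] for t in range(n_out)])

    # Cross-correlate each slice with each filter (the usual CNN "convolution"):
    # contracting the frame and feature axes gives responses of shape (n_out, K).
    responses = np.tensordot(windows, filters, axes=([1, 2], [1, 2]))

    # Keep only the maximum response of each filter inside each pooling window.
    n_pool = n_out // pool_size
    trimmed = responses[:n_pool * pool_size]
    return trimmed.reshape(n_pool, pool_size, K).max(axis=1)


# Toy usage: 6 frames of 2-dimensional features, one all-ones filter of width 2.
toy_features = np.arange(12, dtype=float).reshape(6, 2)
toy_filters = np.ones((1, 2, 2))
pooled = temporal_conv_maxpool(toy_features, toy_filters, pool_size=2)
```

In this sketch each pooled output summarizes `pool_size + W - 1` input frames, which illustrates how a pooled response can cover a longer time span without concatenating neighbouring frames into one large input vector.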
Author_xml – sequence: 1
  surname: Xiaohai Tian
  fullname: Xiaohai Tian
  email: xhtian@ntu.edu.sg
  organization: Sch. of Comput. Sci. & Eng., Nanyang Technol. Univ., Singapore, Singapore
– sequence: 2
  surname: Xiong Xiao
  fullname: Xiong Xiao
  email: xiaoxiong@ntu.edu.sg
  organization: Temasek Labs., NTU, Singapore, Singapore
– sequence: 3
  surname: Eng Siong Chng
  fullname: Eng Siong Chng
  email: ASESChng@ntu.edu.sg
  organization: Sch. of Comput. Sci. & Eng., Nanyang Technol. Univ., Singapore, Singapore
– sequence: 4
  surname: Haizhou Li
  fullname: Haizhou Li
  email: eleliha@nus.edu.sg
  organization: Sch. of Comput. Sci. & Eng., Nanyang Technol. Univ., Singapore, Singapore
EISBN 9881476828, 9789881476821
OpenAccessLink https://proxy.k.utb.cz/login?url=https://dr.ntu.edu.sg/bitstream/10356/89639/1/APSIPA_CNN_ASV.pdf
SubjectTerms Convolution
Feature extraction
Natural languages
Neural networks
Speech
Trajectory
Vocoders
URI https://ieeexplore.ieee.org/document/7820738
https://dr.ntu.edu.sg/bitstream/10356/89639/1/APSIPA_CNN_ASV.pdf
UnpaywallVersion submittedVersion