Microphone Array Speech Separation Algorithm Based on TC-ResNet

Published in Computers, Materials & Continua Vol. 69; no. 2; pp. 2705-2716
Main Authors Zhou, Lin; Xu, Yue; Wang, Tianyi; Feng, Kun; Shi, Jingang
Format Journal Article
Language English
Published Henderson: Tech Science Press, 2021
ISSN 1546-2226, 1546-2218
DOI 10.32604/cmc.2021.017080

Abstract Traditional separation methods have limited ability to separate speech in highly reverberant, low signal-to-noise ratio (SNR) environments and therefore achieve unsatisfactory results. In this study, a convolutional neural network combining temporal convolution with a residual network (TC-ResNet) is proposed to realize speech separation in complex acoustic environments. A simplified steered-response power phase transform, denoted GSRP-PHAT, is employed to reduce the computational cost. The extracted features are reshaped into a tensor that serves as the system input, and temporal convolution is applied to it, which not only enlarges the receptive field of the convolution layer but also significantly reduces the computational cost of the network. Residual blocks are used to combine multiresolution features and accelerate the training procedure. A modified ideal ratio mask is applied as the training target. Simulation results demonstrate that the proposed TC-ResNet-based microphone array speech separation algorithm achieves better performance in terms of distortion ratio, source-to-interference ratio, and short-time objective intelligibility in low-SNR, highly reverberant environments, particularly in untrained situations. This indicates that the proposed method generalizes to untrained conditions.
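The abstract names a modified ideal ratio mask as the training target but does not say how the mask is modified. As a point of reference only, the sketch below computes the conventional ideal ratio mask for each time-frequency unit from the magnitudes of the target speech and the interference. It is not the authors' modified mask; the function names, the exponent `beta`, and the toy arrays are illustrative assumptions, not taken from the article.

```python
import numpy as np


def ideal_ratio_mask(speech_mag, interference_mag, beta=0.5, eps=1e-12):
    """Conventional ideal ratio mask (assumed form, not the paper's variant).

    For each time-frequency unit: (|S|^2 / (|S|^2 + |N|^2)) ** beta,
    where S is the target speech and N is the interference plus noise.
    eps guards against division by zero in silent units.
    """
    speech_pow = np.asarray(speech_mag, dtype=float) ** 2
    interf_pow = np.asarray(interference_mag, dtype=float) ** 2
    return (speech_pow / (speech_pow + interf_pow + eps)) ** beta


def apply_mask(mixture_spec, mask):
    """Scale each time-frequency unit of the mixture by the mask value."""
    return mask * np.asarray(mixture_spec)


# Toy 2x2 magnitude "spectrograms" (hypothetical values, for illustration).
target = np.array([[1.0, 0.0], [2.0, 1.0]])
interference = np.array([[1.0, 1.0], [0.0, 1.0]])
mask = ideal_ratio_mask(target, interference)
masked = apply_mask(target + interference, mask)
```

The mask lies in [0, 1]: units dominated by the target approach 1, units dominated by interference approach 0, and units with equal energy take the value 0.5 ** beta. A network trained on such a target predicts the mask from the input features, and the predicted mask is then applied to the mixture spectrogram.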
Copyright 2021. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Discipline Computer Science
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
License cc-by
OpenAccessLink https://proxy.k.utb.cz/login?url=https://file.techscience.com/ueditor/files/cmc/TSP_CMC_69-2/TSP_CMC_17080/TSP_CMC_17080.pdf
PageCount 12
Subject Terms Algorithms
Arrays
Artificial neural networks
Computational efficiency
Computing costs
Deep learning
Feature extraction
Intelligibility
Neural networks
Separation
Signal processing
Signal to noise ratio
Simulation
Speech
Tensors
Training