Sequence-to-Sequence Acoustic Modeling for Voice Conversion
Published in | IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 27, No. 3, pp. 631-644
Main Authors | Jing-Xuan Zhang; Zhen-Hua Ling; Li-Juan Liu; Yuan Jiang; Li-Rong Dai
Format | Journal Article |
Language | English |
Published | Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.03.2019
Subjects | Ablation; Acoustics; Artificial neural networks; attention; Automatic speech recognition; Cloning; Conditioning; Conversion; Decoding; Feature extraction; Linguistics; Mel-spectrogram; Modelling; Neural networks; Probabilistic models; sequence-to-sequence; Spectrograms; Speech processing; Speech recognition; Vocal tract; Vocoders; Voice conversion; Voice recognition; Waveforms
Online Access | https://ieeexplore.ieee.org/document/8607053
ISSN | 2329-9290
EISSN | 2329-9304
DOI | 10.1109/TASLP.2019.2892235 |
Abstract | In this paper, a neural network named sequence-to-sequence ConvErsion NeTwork (SCENT) is presented for acoustic modeling in voice conversion. At training stage, a SCENT model is estimated by aligning the feature sequences of source and target speakers implicitly using attention mechanism. At the conversion stage, acoustic features and durations of source utterances are converted simultaneously using the unified acoustic model. Mel-scale spectrograms are adopted as acoustic features, which contain both excitation and vocal tract descriptions of speech signals. The bottleneck features extracted from source speech using an automatic speech recognition model are appended as an auxiliary input. A WaveNet vocoder conditioned on Mel-spectrograms is built to reconstruct waveforms from the outputs of the SCENT model. It is worth noting that our proposed method can achieve appropriate duration conversion, which is difficult in conventional methods. Experimental results show that our proposed method obtained better objective and subjective performance than the baseline methods using Gaussian mixture models and deep neural networks as acoustic models. This proposed method also outperformed our previous work, which achieved the top rank in Voice Conversion Challenge 2018. Ablation tests further confirmed the effectiveness of several components in our proposed method. |
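The core idea the abstract describes, implicitly aligning source and target feature sequences with soft attention so that output duration can differ from input duration, can be illustrated with a toy NumPy sketch. This is not the SCENT architecture itself; the function and all names here are hypothetical, and real models would learn the query/key projections.

```python
import numpy as np

def soft_align(source_feats, target_queries):
    """Dot-product attention: soft-align source frames to each target step.

    Returns attention weights of shape (T_tgt, T_src) and the aligned
    source features of shape (T_tgt, D). Because each target step forms
    its own distribution over source frames, T_tgt need not equal T_src,
    which is how duration conversion falls out of the alignment.
    """
    scores = target_queries @ source_feats.T            # (T_tgt, T_src)
    scores -= scores.max(axis=1, keepdims=True)         # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)       # softmax over source frames
    aligned = weights @ source_feats                    # (T_tgt, D)
    return weights, aligned

rng = np.random.default_rng(0)
src = rng.normal(size=(6, 4))   # 6 source frames, 4-dim features
tgt = rng.normal(size=(9, 4))   # 9 target steps: duration differs from source
w, aligned = soft_align(src, tgt)
print(w.shape, aligned.shape)            # (9, 6) (9, 4)
print(np.allclose(w.sum(axis=1), 1.0))   # True: each row is a distribution
```

In the paper's setting the "queries" come from the decoder state rather than raw target features, but the mechanism, a learned soft correspondence replacing explicit frame alignment such as DTW, is the same in spirit.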
Author | Jing-Xuan Zhang; Yuan Jiang; Li-Rong Dai; Zhen-Hua Ling; Li-Juan Liu
CODEN | ITASD8 |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019 |
Funding | National Nature Science Foundation of China (61871358); Key Science and Technology Project of Anhui Province (18030901016); National Key R&D Program of China (2017YFB1002202)
IsPeerReviewed | true |
IsScholarly | true |
ORCID | 0000-0003-4341-3174 (Jing-Xuan Zhang); 0000-0001-7853-5273 (Zhen-Hua Ling)