Sequence-to-Sequence Acoustic Modeling for Voice Conversion
Published in | IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 27, No. 3, pp. 631-644
Main Authors | Jing-Xuan Zhang; Zhen-Hua Ling; Li-Juan Liu; Yuan Jiang; Li-Rong Dai
Format | Journal Article |
Language | English |
Published | Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.03.2019
Subjects | Ablation; Acoustics; Artificial neural networks; attention; Automatic speech recognition; Cloning; Conditioning; Conversion; Decoding; Feature extraction; Linguistics; Mel-spectrogram; Modelling; Neural networks; Probabilistic models; sequence-to-sequence; Spectrograms; Speech processing; Speech recognition; Vocal tract; Vocoders; Voice conversion; Voice recognition; Waveforms
Online Access | https://ieeexplore.ieee.org/document/8607053
ISSN | 2329-9290
EISSN | 2329-9304
DOI | 10.1109/TASLP.2019.2892235 |
Abstract | In this paper, a neural network named sequence-to-sequence ConvErsion NeTwork (SCENT) is presented for acoustic modeling in voice conversion. At training stage, a SCENT model is estimated by aligning the feature sequences of source and target speakers implicitly using attention mechanism. At the conversion stage, acoustic features and durations of source utterances are converted simultaneously using the unified acoustic model. Mel-scale spectrograms are adopted as acoustic features, which contain both excitation and vocal tract descriptions of speech signals. The bottleneck features extracted from source speech using an automatic speech recognition model are appended as an auxiliary input. A WaveNet vocoder conditioned on Mel-spectrograms is built to reconstruct waveforms from the outputs of the SCENT model. It is worth noting that our proposed method can achieve appropriate duration conversion, which is difficult in conventional methods. Experimental results show that our proposed method obtained better objective and subjective performance than the baseline methods using Gaussian mixture models and deep neural networks as acoustic models. This proposed method also outperformed our previous work, which achieved the top rank in Voice Conversion Challenge 2018. Ablation tests further confirmed the effectiveness of several components in our proposed method. |
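The core idea the abstract describes, implicitly aligning source and target feature sequences with soft attention so that output duration can differ from input duration, can be illustrated with a toy NumPy sketch. This is not the SCENT architecture itself; the function and all names here are hypothetical, and real models would learn the query/key projections.

```python
import numpy as np

def soft_align(source_feats, target_queries):
    """Dot-product attention: soft-align source frames to each target step.

    Returns attention weights of shape (T_tgt, T_src) and the aligned
    source features of shape (T_tgt, D). Because each target step forms
    its own distribution over source frames, T_tgt need not equal T_src,
    which is how duration conversion falls out of the alignment.
    """
    scores = target_queries @ source_feats.T            # (T_tgt, T_src)
    scores -= scores.max(axis=1, keepdims=True)         # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)       # softmax over source frames
    aligned = weights @ source_feats                    # (T_tgt, D)
    return weights, aligned

rng = np.random.default_rng(0)
src = rng.normal(size=(6, 4))   # 6 source frames, 4-dim features
tgt = rng.normal(size=(9, 4))   # 9 target steps: duration differs from source
w, aligned = soft_align(src, tgt)
print(w.shape, aligned.shape)            # (9, 6) (9, 4)
print(np.allclose(w.sum(axis=1), 1.0))   # True: each row is a distribution
```

In the paper's setting the "queries" come from the decoder state rather than raw target features, but the mechanism, a learned soft correspondence replacing explicit frame alignment such as DTW, is the same in spirit.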
Author | Jing-Xuan Zhang; Yuan Jiang; Li-Rong Dai; Zhen-Hua Ling; Li-Juan Liu
CODEN | ITASD8 |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019 |
Funding | National Nature Science Foundation of China (61871358); Key Science and Technology Project of Anhui Province (18030901016); National Key R&D Program of China (2017YFB1002202)
IsPeerReviewed | true |
IsScholarly | true |
ORCID | 0000-0003-4341-3174 (Jing-Xuan Zhang); 0000-0001-7853-5273 (Zhen-Hua Ling)