Decoding Visual Neural Representations by Multimodal Learning of Brain-Visual-Linguistic Features
| Published in | IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 45, No. 9, pp. 10760-10777 | 
|---|---|
| Main Authors | Du, Changde; Fu, Kaicheng; Li, Jinpeng; He, Huiguang | 
| Format | Journal Article | 
| Language | English | 
| Published | United States: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.09.2023 | 
| Subjects | Algorithms; Brain; Brain - diagnostic imaging; Brain modeling; Brain-visual-linguistic embedding; Categories; Decoding; generic neural decoding; Humans; Internet; Learning; Linguistics; Model accuracy; multimodal learning; mutual information maximization; Online services; Regularization; Representations; Semantics; Training; Visual Perception; Visual stimuli; Visualization |
| ISSN | 0162-8828; 1939-3539 | 
| EISSN | 2160-9292; 1939-3539 | 
| DOI | 10.1109/TPAMI.2023.3263181 | 
| Abstract | Decoding human visual neural representations is a challenging task with great scientific significance in revealing vision-processing mechanisms and developing brain-like intelligent machines. Most existing methods are difficult to generalize to novel categories that have no corresponding neural data for training. The two main reasons are 1) the under-exploitation of the multimodal semantic knowledge underlying the neural data and 2) the small number of paired (stimuli-responses) training data. To overcome these limitations, this paper presents a generic neural decoding method called BraVL that uses multimodal learning of brain-visual-linguistic features. We focus on modeling the relationships between brain, visual and linguistic features via multimodal deep generative models. Specifically, we leverage the mixture-of-product-of-experts formulation to infer a latent code that enables a coherent joint generation of all three modalities. To learn a more consistent joint representation and improve the data efficiency in the case of limited brain activity data, we exploit both intra- and inter-modality mutual information maximization regularization terms. In particular, our BraVL model can be trained under various semi-supervised scenarios to incorporate the visual and textual features obtained from the extra categories. Finally, we construct three trimodal matching datasets, and the extensive experiments lead to some interesting conclusions and cognitive insights: 1) decoding novel visual categories from human brain activity is practically possible with good accuracy; 2) decoding models using the combination of visual and linguistic features perform much better than those using either of them alone; 3) visual perception may be accompanied by linguistic influences to represent the semantics of visual stimuli. | 
    
|---|---|
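The abstract's mixture-of-product-of-experts inference can be made concrete. Below is a minimal sketch, assuming each modality encoder (brain, visual, linguistic) outputs the mean and log-variance of a diagonal-Gaussian posterior; the function names are illustrative, and the uniform subset mixture follows the generalized multimodal ELBO the paper builds on (Sutter et al., 2021), not the authors' released code.

```python
import itertools
import torch

def product_of_experts(mus, logvars, eps=1e-8):
    """Fuse unimodal diagonal-Gaussian posteriors q(z|x_i) into one
    Gaussian by precision-weighted averaging (Product of Experts)."""
    precisions = [torch.exp(-lv) for lv in logvars]        # 1 / sigma_i^2
    total_precision = sum(precisions) + eps
    joint_var = 1.0 / total_precision
    joint_mu = joint_var * sum(p * m for p, m in zip(precisions, mus))
    return joint_mu, torch.log(joint_var)

def mixture_of_products_of_experts(mus, logvars):
    """MoPoE joint posterior: a uniform mixture over all non-empty
    modality subsets, each subset fused with a PoE. Returns one
    reparameterized sample z plus the per-subset Gaussian parameters
    (the latter are needed when assembling the multimodal ELBO)."""
    n = len(mus)
    subsets = [s for r in range(1, n + 1)
               for s in itertools.combinations(range(n), r)]
    subset_params = [
        product_of_experts([mus[i] for i in s], [logvars[i] for i in s])
        for s in subsets
    ]
    # Ancestral sampling from the mixture: pick a subset uniformly,
    # then reparameterize from its PoE posterior.
    mu, logvar = subset_params[torch.randint(len(subsets), (1,)).item()]
    z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
    return z, subset_params
```

With three modalities this yields seven subsets, so a brain-only posterior is available at test time even though training sees all three; that is what allows decoding when only brain features are observed.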
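The intra- and inter-modality mutual information maximization terms can likewise be sketched. The record does not specify the estimator, so the snippet below uses the common InfoNCE lower bound as a stand-in; maximizing it on paired latent features is one standard way to realize an inter-modality MI regularizer, and the paper may use a different estimator.

```python
import torch
import torch.nn.functional as F

def infonce_lower_bound(z_a, z_b, temperature=0.1):
    """InfoNCE objective on a batch of paired features z_a, z_b ([B, D]).
    Minimizing the cross-entropy below tightens a lower bound on
    I(z_a; z_b) (up to an additive log-batch-size constant), pulling
    matched pairs together and mismatched pairs apart."""
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature        # [B, B] similarities
    labels = torch.arange(z_a.size(0), device=z_a.device)
    return -F.cross_entropy(logits, labels)     # maximize this bound
```

An intra-modality term of the same form, computed between a feature and another view of itself, encourages the latent code to retain modality-specific information rather than keeping only what is easiest to align.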
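Finding 1 (decoding novel categories) implies an inference step that matches a brain-derived latent against class latents built from the visual and textual features of unseen categories. The exact decision rule is not stated in this record, so the following nearest-class-latent sketch is purely illustrative; `brain_mu` and `class_mus` are hypothetical names.

```python
import torch

@torch.no_grad()
def zero_shot_decode(brain_mu, class_mus):
    """Assign each brain-derived latent mean (brain_mu: [N, D]) to the
    closest class latent (class_mus: [C, D]) by cosine similarity.
    Class latents would come from visual/text features of categories
    with no paired neural data."""
    b = torch.nn.functional.normalize(brain_mu, dim=-1)
    c = torch.nn.functional.normalize(class_mus, dim=-1)
    return (b @ c.t()).argmax(dim=-1)   # predicted class index per sample
```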
    
| Author Details | 1. Du, Changde (changde.du@ia.ac.cn; ORCID 0000-0002-0084-433X), Research Center for Brain-Inspired Intelligence, State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China. 2. Fu, Kaicheng (fukaicheng2019@ia.ac.cn), same affiliation. 3. Li, Jinpeng (lijinpeng@ucas.ac.cn; ORCID 0000-0001-8701-2642), Ningbo HwaMei Hospital, UCAS, Zhejiang, China. 4. He, Huiguang (huiguang.he@ia.ac.cn; ORCID 0000-0002-0684-1711), Research Center for Brain-Inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing, China. | 

| CODEN | ITPIDJ | 
    
| ContentType | Journal Article | 
    
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023 | 
    
| Discipline | Engineering Computer Science  | 
    
| EISSN | 2160-9292 1939-3539  | 
    
| EndPage | 10777 | 
    
| Genre | orig-research Research Support, Non-U.S. Gov't Journal Article  | 
    
| Grant Information | National Natural Science Foundation of China (grants 62206284, 62020106015, 61976209; funder ID 10.13039/501100001809); CAAI-Huawei MindSpore Open Fund; National Key R&D Program of China (grant 2022ZD0116500); Strategic Priority Research Program of Chinese Academy of Sciences (grant XDB32040000). | 

| ISSN | 0162-8828 1939-3539  | 
    
| IsPeerReviewed | true | 
    
| IsScholarly | true | 
    
| Issue | 9 | 
    
| Language | English | 
    
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037  | 
    
| ORCID | 0000-0002-0084-433X 0000-0001-8701-2642 0000-0002-0684-1711  | 
    
| PMID | 37030711 | 
    
| PageCount | 18 | 
    
| PublicationCentury | 2000 | 
    
| PublicationDate | 2023-09-01 | 
    
| PublicationDateYYYYMMDD | 2023-09-01 | 
    
| PublicationDecade | 2020 | 
    
| PublicationPlace | United States | 
    
| PublicationTitle | IEEE transactions on pattern analysis and machine intelligence | 
    
| PublicationTitleAbbrev | TPAMI | 
    
| PublicationTitleAlternate | IEEE Trans Pattern Anal Mach Intell | 
    
| PublicationYear | 2023 | 
    
| Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE)  | 
    
| StartPage | 10760 | 
    
| SubjectTerms | Algorithms Brain Brain - diagnostic imaging Brain modeling Brain-visual-linguistic embedding Categories Decoding generic neural decoding Humans Internet Learning Linguistics Model accuracy multimodal learning mutual information maximization Online services Regularization Representations Semantics Training Visual Perception Visual stimuli Visualization  | 
    
| Title | Decoding Visual Neural Representations by Multimodal Learning of Brain-Visual-Linguistic Features | 
    
| URI | https://ieeexplore.ieee.org/document/10089190 https://www.ncbi.nlm.nih.gov/pubmed/37030711 https://www.proquest.com/docview/2847957620 https://www.proquest.com/docview/2798714073  | 
    
| Volume | 45 | 
    