Decoding Visual Neural Representations by Multimodal Learning of Brain-Visual-Linguistic Features

Bibliographic Details
Published in IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 45, No. 9, pp. 10760-10777
Main Authors Du, Changde; Fu, Kaicheng; Li, Jinpeng; He, Huiguang
Format Journal Article
Language English
Published United States: IEEE, 01.09.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 0162-8828
1939-3539
2160-9292
DOI 10.1109/TPAMI.2023.3263181


Abstract Decoding human visual neural representations is a challenging task with great scientific significance in revealing vision-processing mechanisms and developing brain-like intelligent machines. Most existing methods generalize poorly to novel categories for which no corresponding neural data are available for training. The two main reasons are 1) the under-exploitation of the multimodal semantic knowledge underlying the neural data and 2) the small amount of paired (stimuli-responses) training data. To overcome these limitations, this paper presents a generic neural decoding method called BraVL that uses multimodal learning of brain-visual-linguistic features. We focus on modeling the relationships between brain, visual and linguistic features via multimodal deep generative models. Specifically, we leverage the mixture-of-product-of-experts formulation to infer a latent code that enables coherent joint generation of all three modalities. To learn a more consistent joint representation and improve data efficiency when brain activity data are limited, we exploit both intra- and inter-modality mutual-information maximization regularization terms. In particular, our BraVL model can be trained under various semi-supervised scenarios to incorporate visual and textual features obtained from extra categories. Finally, we construct three trimodal matching datasets, and extensive experiments lead to some interesting conclusions and cognitive insights: 1) decoding novel visual categories from human brain activity is practically possible with good accuracy; 2) decoding models that combine visual and linguistic features perform much better than those using either alone; 3) visual perception may be accompanied by linguistic influences to represent the semantics of visual stimuli.
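The abstract's core mechanism, a mixture-of-product-of-experts (MoPoE) posterior over brain, visual and linguistic features, can be sketched compactly. The following is a minimal PyTorch illustration, not the authors' released implementation: all layer sizes, feature dimensionalities, and names (MoPoEPosterior, product_of_experts) are assumptions made for this example. The intra- and inter-modality mutual-information regularizers mentioned in the abstract would enter as additional variational MI bounds on top of the ELBO computed from these subset posteriors.

```python
# Illustrative sketch of a MoPoE posterior for three modalities;
# dimensions and names are hypothetical, not from the paper's code.
import itertools
import torch
import torch.nn as nn

def product_of_experts(mus, logvars):
    # Precision-weighted fusion of Gaussian experts, including an
    # implicit standard-normal prior expert (mu = 0, logvar = 0).
    mus = torch.stack([torch.zeros_like(mus[0]), *mus])
    logvars = torch.stack([torch.zeros_like(logvars[0]), *logvars])
    precision = torch.exp(-logvars)             # 1 / sigma^2 per expert
    joint_var = 1.0 / precision.sum(dim=0)
    joint_mu = joint_var * (precision * mus).sum(dim=0)
    return joint_mu, joint_var.log()

class MoPoEPosterior(nn.Module):
    # q(z | brain, visual, text): a uniform mixture over the PoE
    # posteriors of every non-empty subset of modalities.
    def __init__(self, feature_dims, z_dim=128):
        super().__init__()
        self.encoders = nn.ModuleDict({
            name: nn.Sequential(nn.Linear(dim, 256), nn.ReLU(),
                                nn.Linear(256, 2 * z_dim))
            for name, dim in feature_dims.items()})

    def forward(self, features):
        # Per-modality Gaussian parameters (mu, logvar).
        stats = {name: self.encoders[name](x).chunk(2, dim=-1)
                 for name, x in features.items()}
        # PoE parameters for every non-empty modality subset
        # (7 subsets for 3 modalities); a multimodal ELBO averages
        # its reconstruction/KL terms over this mixture.
        names = sorted(stats)
        return [product_of_experts([stats[n][0] for n in subset],
                                   [stats[n][1] for n in subset])
                for r in range(1, len(names) + 1)
                for subset in itertools.combinations(names, r)]

# Hypothetical feature dimensionalities for the three modalities.
dims = {"brain": 4096, "visual": 768, "text": 768}
model = MoPoEPosterior(dims)
batch = {name: torch.randn(8, d) for name, d in dims.items()}
subset_posteriors = model(batch)   # list of 7 (mu, logvar) pairs
```

Because any subset of modalities yields a valid posterior under this fusion, a model trained on all three modalities can still infer the latent code from brain activity alone at test time, which is what the abstract's novel-category decoding relies on.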
Author_xml – sequence: 1
  givenname: Changde
  orcidid: 0000-0002-0084-433X
  surname: Du
  fullname: Du, Changde
  email: changde.du@ia.ac.cn
  organization: Research Center for Brain-Inspired Intelligence, State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
– sequence: 2
  givenname: Kaicheng
  surname: Fu
  fullname: Fu, Kaicheng
  email: fukaicheng2019@ia.ac.cn
  organization: Research Center for Brain-Inspired Intelligence, State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
– sequence: 3
  givenname: Jinpeng
  orcidid: 0000-0001-8701-2642
  surname: Li
  fullname: Li, Jinpeng
  email: lijinpeng@ucas.ac.cn
  organization: Ningbo HwaMei Hospital, UCAS, Zhejiang, China
– sequence: 4
  givenname: Huiguang
  orcidid: 0000-0002-0684-1711
  surname: He
  fullname: He, Huiguang
  email: huiguang.he@ia.ac.cn
  organization: Research Center for Brain-Inspired Intelligence, State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
BackLink https://www.ncbi.nlm.nih.gov/pubmed/37030711 (View this record in MEDLINE/PubMed)
CODEN ITPIDJ
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
DOI 10.1109/TPAMI.2023.3263181
Discipline Engineering
Computer Science
EISSN 2160-9292
1939-3539
EndPage 10777
ExternalDocumentID 37030711
10_1109_TPAMI_2023_3263181
10089190
Genre orig-research
Research Support, Non-U.S. Gov't
Journal Article
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 62206284; 62020106015; 61976209
  funderid: 10.13039/501100001809
– fundername: CAAI-Huawei MindSpore Open Fund
– fundername: National Key R&D Program of China
  grantid: 2022ZD0116500
– fundername: Strategic Priority Research Program of Chinese Academy of Sciences
  grantid: XDB32040000
ISSN 0162-8828
1939-3539
IsPeerReviewed true
IsScholarly true
Issue 9
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
ORCID 0000-0002-0084-433X
0000-0001-8701-2642
0000-0002-0684-1711
PMID 37030711
PQID 2847957620
PQPubID 85458
PageCount 18
PublicationDate 2023-09-01
PublicationPlace United States (New York)
PublicationTitle IEEE transactions on pattern analysis and machine intelligence
PublicationTitleAbbrev TPAMI
PublicationTitleAlternate IEEE Trans Pattern Anal Mach Intell
PublicationYear 2023
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 10760
SubjectTerms Algorithms
Brain
Brain - diagnostic imaging
Brain modeling
Brain-visual-linguistic embedding
Categories
Decoding
generic neural decoding
Humans
Internet
Learning
Linguistics
Model accuracy
multimodal learning
mutual information maximization
Online services
Regularization
Representations
Semantics
Training
Visual Perception
Visual stimuli
Visualization
Title Decoding Visual Neural Representations by Multimodal Learning of Brain-Visual-Linguistic Features
URI https://ieeexplore.ieee.org/document/10089190
https://www.ncbi.nlm.nih.gov/pubmed/37030711
https://www.proquest.com/docview/2847957620
https://www.proquest.com/docview/2798714073
Volume 45