Second-order encoding networks for semantic segmentation

Bibliographic Details
Published in Neurocomputing (Amsterdam) Vol. 445; pp. 50-60
Main Authors Sun, Qiule; Zhang, Zhimin; Li, Peihua
Format Journal Article
Language English
Published Elsevier B.V. 20.07.2021
ISSN 0925-2312
EISSN 1872-8286
DOI 10.1016/j.neucom.2021.03.003

Abstract Recently, most state-of-the-art semantic segmentation methods have focused on context modeling for more accurate prediction. As real-world images often contain multiple objects and stuff, image features may have complex, multi-modal distributions. However, existing methods do not fully consider such complex distributions and therefore have limited capability for context modeling. To address this problem, this paper proposes a second-order encoding network (SoENet), trainable end-to-end, for harvesting complex contextual knowledge. At the core of SoENet is an encoding module that captures second-order statistics in individual feature subspaces. Specifically, the entire feature space is divided into a set of subspaces (clusters) represented by codewords, and in each subspace a covariance matrix is computed for second-order statistical modeling. The covariance matrices of all subspaces are concatenated to form a 3D tensor, which is then passed through convolutions and nonlinear activations and finally used to scale the input features. In this way, context that involves the complex distribution is encoded into the learning process in an end-to-end manner. The proposed SoENet is evaluated on four commonly used, challenging benchmarks: PASCAL Context, PASCAL VOC 2012, ADE20K and Cityscapes. The experiments show that the network significantly outperforms its counterparts and is competitive with state-of-the-art methods.
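
The encoding step described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation; it only follows the outline given above (codewords, one covariance matrix per subspace, a stacked 3D tensor, a gating step that scales the input features). Several details are assumptions made for illustration and are not stated in the abstract: the soft assignment via a softmax over negative squared distances, the `temperature` and `eps` parameters, and the function names `subspace_covariances` and `scale_features`. In particular, the single linear map plus sigmoid in `scale_features` is a hypothetical stand-in for the convolutions and nonlinear activations mentioned in the abstract.

```python
import numpy as np


def subspace_covariances(features, codewords, temperature=1.0, eps=1e-6):
    """Soft-assign features to codewords and compute one covariance per codeword.

    features:  (N, D) array, N local descriptors with D channels.
    codewords: (K, D) array, one codeword per subspace (cluster).
    Returns a (K, D, D) stack of covariance matrices.
    The softmax assignment is an assumption, not taken from the abstract.
    """
    # Squared Euclidean distance of every feature to every codeword: (N, K).
    diff = features[:, None, :] - codewords[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)

    # Numerically stable softmax over codewords gives assignment weights.
    logits = -sq_dist / temperature
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)

    # Weighted mean of each subspace: (K, D).
    mass = weights.sum(axis=0) + eps
    means = (weights.T @ features) / mass[:, None]

    # Weighted covariance of each subspace: (K, D, D).
    centered = features[:, None, :] - means[None, :, :]
    cov = np.einsum("nk,nkd,nke->kde", weights, centered, centered)
    return cov / mass[:, None, None]


def scale_features(features, cov_stack, w, b):
    """Turn the stacked covariances into per-channel gates and scale the input.

    w: (K * D, D) and b: (D,) are hypothetical learnable parameters; this
    linear map plus sigmoid merely stands in for the paper's convolutions.
    """
    pooled = cov_stack.mean(axis=2).reshape(-1)        # (K * D,)
    gates = 1.0 / (1.0 + np.exp(-(pooled @ w + b)))    # (D,), values in (0, 1)
    return features * gates                            # broadcast over N


# Illustrative usage with random data (shapes only; values carry no meaning).
rng = np.random.default_rng(0)
feats = rng.normal(size=(64, 8))     # 64 spatial positions, 8 channels
codes = rng.normal(size=(4, 8))      # 4 codewords
cov = subspace_covariances(feats, codes)
scaled = scale_features(feats, cov, 0.1 * rng.normal(size=(32, 8)), np.zeros(8))
```

With a single codeword the assignment weights are all one, so the sketch reduces to ordinary global covariance pooling; using several codewords is what lets each covariance describe one mode of a multi-modal feature distribution.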
Author Sun, Qiule
Zhang, Zhimin
Li, Peihua (email: peihuali@dlut.edu.cn)
ContentType Journal Article
Copyright 2021 Elsevier B.V.
Discipline Computer Science
EISSN 1872-8286
EndPage 60
IsPeerReviewed true
IsScholarly true
Keywords Multi-modal distributions
Covariance pooling
Semantic segmentation
Context modeling
Language English
PageCount 11
PublicationCentury 2000
PublicationDate 2021-07-20
PublicationDecade 2020
PublicationTitle Neurocomputing (Amsterdam)
PublicationYear 2021
Publisher Elsevier B.V
– reference: Y. Chen, Y. Kalantidis, J. Li, S. Yan, J. Feng,
– reference: J. Long, E. Shelhamer, T. Darrell, Fully convolutional networks for semantic segmentation, in: Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2015, pp. 3431–3440.
– volume: 30
  start-page: 2244
  year: 2018
  end-page: 2250
  ident: b0135
  article-title: Deep fishernet for image classification
  publication-title: IEEE Trans. Neural Netw. Learn. Syst.
– year: 2008
  ident: b0195
  article-title: Functions of Matrices: Theory and Computation
– volume: 40
  start-page: 834
  issue: 4
  year: 2018
  ident: 10.1016/j.neucom.2021.03.003_b0050
  article-title: Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
  doi: 10.1109/TPAMI.2017.2699184
– ident: 10.1016/j.neucom.2021.03.003_b0245
  doi: 10.1007/978-3-030-01219-9_37
– ident: 10.1016/j.neucom.2021.03.003_b0060
  doi: 10.1109/CVPR.2018.00199
– ident: 10.1016/j.neucom.2021.03.003_b0210
  doi: 10.1109/CVPR.2014.119
– volume: 62
  start-page: 227101
  issue: 12
  year: 2019
  ident: 10.1016/j.neucom.2021.03.003_b0125
  article-title: An open-source project for real-time image semantic segmentation
  publication-title: Sci. China-Inf. Sci.
  doi: 10.1007/s11432-019-2685-1
– ident: 10.1016/j.neucom.2021.03.003_b0150
  doi: 10.1109/CVPR.2019.00314
– ident: 10.1016/j.neucom.2021.03.003_b0335
  doi: 10.1007/978-3-030-58520-4_26
– ident: 10.1016/j.neucom.2021.03.003_b0265
  doi: 10.1109/CVPR.2019.00545
– ident: 10.1016/j.neucom.2021.03.003_b0075
  doi: 10.1109/CVPR.2018.00813
– ident: 10.1016/j.neucom.2021.03.003_b0285
  doi: 10.1109/CVPR.2017.195
– ident: 10.1016/j.neucom.2021.03.003_b0180
  doi: 10.1109/CVPR42600.2020.01155
– ident: 10.1016/j.neucom.2021.03.003_b0130
  doi: 10.1109/ICIP.2019.8803154
– ident: 10.1016/j.neucom.2021.03.003_b0055
  doi: 10.1109/CVPR.2018.00747
– volume: 40
  start-page: 1437
  issue: 6
  year: 2018
  ident: 10.1016/j.neucom.2021.03.003_b0080
  article-title: Netvlad: Cnn architecture for weakly supervised place recognition
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
  doi: 10.1109/TPAMI.2017.2711011
– ident: 10.1016/j.neucom.2021.03.003_b0160
– ident: 10.1016/j.neucom.2021.03.003_b0145
  doi: 10.1109/ICCV.2015.170
– ident: 10.1016/j.neucom.2021.03.003_b0315
  doi: 10.1007/978-3-030-01228-1_26
– ident: 10.1016/j.neucom.2021.03.003_b0175
  doi: 10.1007/978-3-030-01234-2_1
– volume: 22
  start-page: 555
  issue: 2
  year: 2019
  ident: 10.1016/j.neucom.2021.03.003_b0290
  article-title: Multi-scale deep context convolutional neural networks for semantic segmentation
  publication-title: World Wide Web
  doi: 10.1007/s11280-018-0556-3
– ident: 10.1016/j.neucom.2021.03.003_b0305
– ident: 10.1016/j.neucom.2021.03.003_b0005
– volume: 282
  start-page: 174
  year: 2018
  ident: 10.1016/j.neucom.2021.03.003_b0095
  article-title: Hyperlayer bilinear pooling with application to fine-grained categorization and image retrieval
  publication-title: Neurocomputing
  doi: 10.1016/j.neucom.2017.12.020
– volume: 9
  start-page: 2579
  issue: 86
  year: 2008
  ident: 10.1016/j.neucom.2021.03.003_b0035
  article-title: Visualizing data using t-sne
  publication-title: J. Mach. Learn. Res.
– ident: 10.1016/j.neucom.2021.03.003_b0090
  doi: 10.1109/CVPR.2017.309
– ident: 10.1016/j.neucom.2021.03.003_b0230
– ident: 10.1016/j.neucom.2021.03.003_b0040
– volume: 88
  start-page: 365
  issue: 2
  year: 2004
  ident: 10.1016/j.neucom.2021.03.003_b0200
  article-title: A well-conditioned estimator for large-dimensional covariance matrices
  publication-title: J. Multivar. Anal.
  doi: 10.1016/S0047-259X(03)00096-4
– ident: 10.1016/j.neucom.2021.03.003_b0275
  doi: 10.1007/978-3-319-10602-1_48
– ident: 10.1016/j.neucom.2021.03.003_b0260
  doi: 10.1109/CVPR.2019.00324
– ident: 10.1016/j.neucom.2021.03.003_b0140
  doi: 10.1109/ICCV.2015.339
– ident: 10.1016/j.neucom.2021.03.003_b0320
  doi: 10.1109/CVPR.2019.00053
– ident: 10.1016/j.neucom.2021.03.003_b0115
  doi: 10.1109/CVPR42600.2020.01308
– ident: 10.1016/j.neucom.2021.03.003_b0300
– ident: 10.1016/j.neucom.2021.03.003_b0205
  doi: 10.1109/CVPR.2016.480
– ident: 10.1016/j.neucom.2021.03.003_b0165
  doi: 10.1109/ICCV.2019.00386
– ident: 10.1016/j.neucom.2021.03.003_b0220
  doi: 10.1109/CVPR.2017.544
– volume: 30
  start-page: 2244
  issue: 7
  year: 2018
  ident: 10.1016/j.neucom.2021.03.003_b0135
  article-title: Deep fishernet for image classification
  publication-title: IEEE Trans. Neural Netw. Learn. Syst.
  doi: 10.1109/TNNLS.2018.2874657
– year: 2008
  ident: 10.1016/j.neucom.2021.03.003_b0195
– ident: 10.1016/j.neucom.2021.03.003_b0105
  doi: 10.1109/CVPR.2018.00105
– volume: 29
  start-page: 773
  issue: 3
  year: 2019
  ident: 10.1016/j.neucom.2021.03.003_b0185
  article-title: Two-stream collaborative learning with spatial-temporal attention for video classification
  publication-title: IEEE Trans. Circuit Syst. Video Technol.
  doi: 10.1109/TCSVT.2018.2808685
– ident: 10.1016/j.neucom.2021.03.003_b0120
– ident: 10.1016/j.neucom.2021.03.003_b0225
  doi: 10.1109/CVPR.2016.350
– ident: 10.1016/j.neucom.2021.03.003_b0015
– ident: 10.1016/j.neucom.2021.03.003_b0170
  doi: 10.1109/CVPR.2018.00745
– volume: 29
  start-page: 3520
  issue: 1
  year: 2020
  ident: 10.1016/j.neucom.2021.03.003_b0280
  article-title: Semantic segmentation with context encoding and multi-path decoding
  publication-title: IEEE Trans. Image Process.
  doi: 10.1109/TIP.2019.2962685
– ident: 10.1016/j.neucom.2021.03.003_b0340
  doi: 10.1007/978-3-030-58520-4_1
– ident: 10.1016/j.neucom.2021.03.003_b0030
– ident: 10.1016/j.neucom.2021.03.003_b0155
  doi: 10.1109/CVPR42600.2020.01078
– ident: 10.1016/j.neucom.2021.03.003_b0045
  doi: 10.1109/CVPR.2017.660
– ident: 10.1016/j.neucom.2021.03.003_b0270
  doi: 10.1109/CVPR42600.2020.00897
– ident: 10.1016/j.neucom.2021.03.003_b0100
  doi: 10.1109/ICCV.2017.228
– volume: 384
  start-page: 182
  year: 2020
  ident: 10.1016/j.neucom.2021.03.003_b0310
  article-title: Dynamic attention network for semantic segmentation
  publication-title: Neurocomputing
  doi: 10.1016/j.neucom.2019.12.042
– ident: 10.1016/j.neucom.2021.03.003_b0325
  doi: 10.1109/ICCV.2019.00195
– volume: 32
  start-page: 1271
  issue: 7
  year: 2010
  ident: 10.1016/j.neucom.2021.03.003_b0190
  article-title: Visual word ambiguity
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
  doi: 10.1109/TPAMI.2009.132
– ident: 10.1016/j.neucom.2021.03.003_b0020
  doi: 10.1109/CVPR.2009.5206848
– ident: 10.1016/j.neucom.2021.03.003_b0250
  doi: 10.1109/CVPR.2018.00254
– ident: 10.1016/j.neucom.2021.03.003_b0295
  doi: 10.1109/CVPR.2017.518
– ident: 10.1016/j.neucom.2021.03.003_b0065
  doi: 10.1109/CVPR.2019.00326
– ident: 10.1016/j.neucom.2021.03.003_b0025
  doi: 10.1109/CVPR.2015.7298965
– ident: 10.1016/j.neucom.2021.03.003_b0010
– ident: 10.1016/j.neucom.2021.03.003_b0235
  doi: 10.1109/CVPR.2019.00417
– ident: 10.1016/j.neucom.2021.03.003_b0240
  doi: 10.1109/CVPR.2017.549
– ident: 10.1016/j.neucom.2021.03.003_b0085
  doi: 10.1109/CVPR.2010.5540039
– ident: 10.1016/j.neucom.2021.03.003_b0330
  doi: 10.1109/CVPR.2019.00052
– ident: 10.1016/j.neucom.2021.03.003_b0110
  doi: 10.1109/CVPR.2019.00522
– ident: 10.1016/j.neucom.2021.03.003_b0255
  doi: 10.1109/CVPR.2019.00767
– ident: 10.1016/j.neucom.2021.03.003_b0070
  doi: 10.1109/ICCV.2019.00068
– volume: 88
  start-page: 303
  issue: 2
  year: 2010
  ident: 10.1016/j.neucom.2021.03.003_b0215
  article-title: The pascal visual object classes (voc) challenge
  publication-title: Int. J. Comput. Vis.
  doi: 10.1007/s11263-009-0275-4
StartPage 50
SubjectTerms Context modeling; Covariance pooling; Multi-modal distributions; Semantic segmentation
Title Second-order encoding networks for semantic segmentation
URI https://dx.doi.org/10.1016/j.neucom.2021.03.003
Volume 445