Second-order encoding networks for semantic segmentation

Recently, most state-of-the-art semantic segmentation methods have focused on context modeling for more accurate prediction. As real-world images often contain multiple objects and stuff, image features may have complex and multi-modal distributions. However, existing methods do not fully cons...

Bibliographic Details
Published in	Neurocomputing (Amsterdam) Vol. 445; pp. 50 - 60
Main Authors	Sun, Qiule, Zhang, Zhimin, Li, Peihua
Format	Journal Article
Language	English
Published	Elsevier B.V. 20.07.2021
Subjects	Context modeling; Covariance pooling; Multi-modal distributions; Semantic segmentation
ISSN	0925-2312 1872-8286
DOI	10.1016/j.neucom.2021.03.003

Abstract	Recently, most state-of-the-art semantic segmentation methods have focused on context modeling for more accurate prediction. As real-world images often contain multiple objects and stuff, image features may have complex and multi-modal distributions. However, existing methods do not fully consider such complex distributions and therefore have limited capability for context modeling. To address this problem, this paper proposes a second-order encoding network (SoENet), trainable end-to-end, for harvesting complex contextual knowledge. At the core of SoENet is an encoding module that captures second-order statistics in individual feature subspaces. Specifically, the entire feature space is divided into a set of subspaces (clusters) represented by codewords, and in each subspace a covariance matrix is computed for second-order statistical modeling. The covariance matrices of all subspaces are concatenated to form a 3D tensor, which is then subjected to convolutions and nonlinear activations and finally used to scale the input features. In this way, context involving the complex distribution is encoded into the learning process in an end-to-end manner. The proposed SoENet is evaluated on four commonly used, challenging benchmarks: PASCAL Context, PASCAL VOC 2012, ADE20K and Cityscapes. The experiments show that the network significantly outperforms its counterparts and is competitive with state-of-the-art methods.
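The encoding step summarized in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the softmax soft assignment, the `smoothing` parameter, the function names, and the single projection `proj` that stands in for the convolutions and nonlinear activations are all assumptions made for illustration. Only the overall flow (codewords, one covariance matrix per subspace, a stacked 3D tensor, scaling of the input features) follows the abstract.

```python
import numpy as np


def subspace_covariances(features, codewords, smoothing=1.0):
    """Compute one weighted covariance matrix per codeword.

    features:  (N, D) array of N local feature vectors.
    codewords: (K, D) array of K cluster centres (hypothetical, e.g. learned).
    Returns a (K, D, D) tensor of stacked covariance matrices.
    """
    # Squared distance between every feature and every codeword: (N, K).
    diffs = features[:, None, :] - codewords[None, :, :]
    sq_dist = np.sum(diffs ** 2, axis=-1)

    # Assumed soft assignment: softmax over codewords of the negative distance.
    logits = -smoothing * sq_dist
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)

    # Weighted mean of the features in each subspace: (K, D).
    mass = weights.sum(axis=0) + 1e-8
    means = (weights.T @ features) / mass[:, None]

    # Weighted covariance in each subspace: (K, D, D).
    centered = features[:, None, :] - means[None, :, :]
    covs = np.einsum("nk,nkd,nke->kde", weights, centered, centered)
    return covs / mass[:, None, None]


def scale_features(features, covs, proj):
    """Turn the covariance tensor into per-channel gates and rescale features.

    proj: (K*D*D, D) weights, a hypothetical stand-in for the convolutions
    and nonlinear activations mentioned in the abstract.
    """
    z = covs.reshape(-1) @ proj
    gates = 0.5 * (1.0 + np.tanh(0.5 * z))  # sigmoid, written to avoid overflow
    return features * gates[None, :]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(50, 4))        # 50 features with 4 channels
    c = rng.normal(size=(3, 4))         # 3 codewords
    covs = subspace_covariances(x, c)
    w = 0.1 * rng.normal(size=(3 * 4 * 4, 4))
    print(covs.shape, scale_features(x, covs, w).shape)
```

Keeping a separate covariance matrix per codeword, rather than one global matrix, is what lets the sketch describe each mode of a multi-modal feature distribution on its own.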
Author_xml	– sequence: 1 givenname: Qiule surname: Sun fullname: Sun, Qiule – sequence: 2 givenname: Zhimin surname: Zhang fullname: Zhang, Zhimin – sequence: 3 givenname: Peihua surname: Li fullname: Li, Peihua email: peihuali@dlut.edu.cn
Copyright	2021 Elsevier B.V.
Discipline	Computer Science
IsPeerReviewed	true
IsScholarly	true
PageCount	11
doi: 10.1109/TPAMI.2017.2711011 – ident: 10.1016/j.neucom.2021.03.003_b0160 – ident: 10.1016/j.neucom.2021.03.003_b0145 doi: 10.1109/ICCV.2015.170 – ident: 10.1016/j.neucom.2021.03.003_b0315 doi: 10.1007/978-3-030-01228-1_26 – ident: 10.1016/j.neucom.2021.03.003_b0175 doi: 10.1007/978-3-030-01234-2_1 – volume: 22 start-page: 555 issue: 2 year: 2019 ident: 10.1016/j.neucom.2021.03.003_b0290 article-title: Multi-scale deep context convolutional neural networks for semantic segmentation publication-title: World Wide Web doi: 10.1007/s11280-018-0556-3 – ident: 10.1016/j.neucom.2021.03.003_b0305 – ident: 10.1016/j.neucom.2021.03.003_b0005 – volume: 282 start-page: 174 year: 2018 ident: 10.1016/j.neucom.2021.03.003_b0095 article-title: Hyperlayer bilinear pooling with application to fine-grained categorization and image retrieval publication-title: Neurocomputing doi: 10.1016/j.neucom.2017.12.020 – volume: 9 start-page: 2579 issue: 86 year: 2008 ident: 10.1016/j.neucom.2021.03.003_b0035 article-title: Visualizing data using t-sne publication-title: J. Mach. Learn. Res. – ident: 10.1016/j.neucom.2021.03.003_b0090 doi: 10.1109/CVPR.2017.309 – ident: 10.1016/j.neucom.2021.03.003_b0230 – ident: 10.1016/j.neucom.2021.03.003_b0040 – volume: 88 start-page: 365 issue: 2 year: 2004 ident: 10.1016/j.neucom.2021.03.003_b0200 article-title: A well-conditioned estimator for large-dimensional covariance matrices publication-title: J. Multivar. Anal. 
doi: 10.1016/S0047-259X(03)00096-4 – ident: 10.1016/j.neucom.2021.03.003_b0275 doi: 10.1007/978-3-319-10602-1_48 – ident: 10.1016/j.neucom.2021.03.003_b0260 doi: 10.1109/CVPR.2019.00324 – ident: 10.1016/j.neucom.2021.03.003_b0140 doi: 10.1109/ICCV.2015.339 – ident: 10.1016/j.neucom.2021.03.003_b0320 doi: 10.1109/CVPR.2019.00053 – ident: 10.1016/j.neucom.2021.03.003_b0115 doi: 10.1109/CVPR42600.2020.01308 – ident: 10.1016/j.neucom.2021.03.003_b0300 – ident: 10.1016/j.neucom.2021.03.003_b0205 doi: 10.1109/CVPR.2016.480 – ident: 10.1016/j.neucom.2021.03.003_b0165 doi: 10.1109/ICCV.2019.00386 – ident: 10.1016/j.neucom.2021.03.003_b0220 doi: 10.1109/CVPR.2017.544 – volume: 30 start-page: 2244 issue: 7 year: 2018 ident: 10.1016/j.neucom.2021.03.003_b0135 article-title: Deep fishernet for image classification publication-title: IEEE Trans. Neural Netw. Learn. Syst. doi: 10.1109/TNNLS.2018.2874657 – year: 2008 ident: 10.1016/j.neucom.2021.03.003_b0195 – ident: 10.1016/j.neucom.2021.03.003_b0105 doi: 10.1109/CVPR.2018.00105 – volume: 29 start-page: 773 issue: 3 year: 2019 ident: 10.1016/j.neucom.2021.03.003_b0185 article-title: Two-stream collaborative learning with spatial-temporal attention for video classification publication-title: IEEE Trans. Circuit Syst. Video Technol. doi: 10.1109/TCSVT.2018.2808685 – ident: 10.1016/j.neucom.2021.03.003_b0120 – ident: 10.1016/j.neucom.2021.03.003_b0225 doi: 10.1109/CVPR.2016.350 – ident: 10.1016/j.neucom.2021.03.003_b0015 – ident: 10.1016/j.neucom.2021.03.003_b0170 doi: 10.1109/CVPR.2018.00745 – volume: 29 start-page: 3520 issue: 1 year: 2020 ident: 10.1016/j.neucom.2021.03.003_b0280 article-title: Semantic segmentation with context encoding and multi-path decoding publication-title: IEEE Trans. Image Process. 
doi: 10.1109/TIP.2019.2962685 – ident: 10.1016/j.neucom.2021.03.003_b0340 doi: 10.1007/978-3-030-58520-4_1 – ident: 10.1016/j.neucom.2021.03.003_b0030 – ident: 10.1016/j.neucom.2021.03.003_b0155 doi: 10.1109/CVPR42600.2020.01078 – ident: 10.1016/j.neucom.2021.03.003_b0045 doi: 10.1109/CVPR.2017.660 – ident: 10.1016/j.neucom.2021.03.003_b0270 doi: 10.1109/CVPR42600.2020.00897 – ident: 10.1016/j.neucom.2021.03.003_b0100 doi: 10.1109/ICCV.2017.228 – volume: 384 start-page: 182 year: 2020 ident: 10.1016/j.neucom.2021.03.003_b0310 article-title: Dynamic attention network for semantic segmentation publication-title: Neurocomputing doi: 10.1016/j.neucom.2019.12.042 – ident: 10.1016/j.neucom.2021.03.003_b0325 doi: 10.1109/ICCV.2019.00195 – volume: 32 start-page: 1271 issue: 7 year: 2010 ident: 10.1016/j.neucom.2021.03.003_b0190 article-title: Visual word ambiguity publication-title: IEEE Trans. Pattern Anal. Mach. Intell. doi: 10.1109/TPAMI.2009.132 – ident: 10.1016/j.neucom.2021.03.003_b0020 doi: 10.1109/CVPR.2009.5206848 – ident: 10.1016/j.neucom.2021.03.003_b0250 doi: 10.1109/CVPR.2018.00254 – ident: 10.1016/j.neucom.2021.03.003_b0295 doi: 10.1109/CVPR.2017.518 – ident: 10.1016/j.neucom.2021.03.003_b0065 doi: 10.1109/CVPR.2019.00326 – ident: 10.1016/j.neucom.2021.03.003_b0025 doi: 10.1109/CVPR.2015.7298965 – ident: 10.1016/j.neucom.2021.03.003_b0010 – ident: 10.1016/j.neucom.2021.03.003_b0235 doi: 10.1109/CVPR.2019.00417 – ident: 10.1016/j.neucom.2021.03.003_b0240 doi: 10.1109/CVPR.2017.549 – ident: 10.1016/j.neucom.2021.03.003_b0085 doi: 10.1109/CVPR.2010.5540039 – ident: 10.1016/j.neucom.2021.03.003_b0330 doi: 10.1109/CVPR.2019.00052 – ident: 10.1016/j.neucom.2021.03.003_b0110 doi: 10.1109/CVPR.2019.00522 – ident: 10.1016/j.neucom.2021.03.003_b0255 doi: 10.1109/CVPR.2019.00767 – ident: 10.1016/j.neucom.2021.03.003_b0070 doi: 10.1109/ICCV.2019.00068 – volume: 88 start-page: 303 issue: 2 year: 2010 ident: 10.1016/j.neucom.2021.03.003_b0215 article-title: The pascal 
visual object classes (voc) challenge publication-title: Int. J. Comput. Vis. doi: 10.1007/s11263-009-0275-4
StartPage	50
SubjectTerms	Context modeling Covariance pooling Multi-modal distributions Semantic segmentation
Title	Second-order encoding networks for semantic segmentation
URI	https://dx.doi.org/10.1016/j.neucom.2021.03.003
Volume	445