Cross-CAM: Focused Visual Explanations for Deep Convolutional Networks via Training-Set Tracing
| Published in | Knowledge Science, Engineering and Management, Vol. 13368, pp. 735-745 |
|---|---|
| Main Authors | |
| Format | Book Chapter |
| Language | English |
| Published | Switzerland: Springer International Publishing, 2022 |
| Series | Lecture Notes in Computer Science |
| Subjects | |
| ISBN | 9783031109829 3031109821 |
| ISSN | 0302-9743 1611-3349 |
| DOI | 10.1007/978-3-031-10983-6_56 |
| Summary: | In recent years, widely used deep learning technologies have remained controversial in terms of reliability and credibility. Class Activation Maps (CAMs) have been proposed to explain deep learning models. Existing CAM-based algorithms highlight critical portions of the input image, but they go no further in tracing the basis of the neural network's decision. This work proposes Cross-CAM, a visual interpretation method that supports deep traceability to prediction-basis samples and focuses on category-relevant regions shared by the input image and those samples. Cross-CAM extracts deep discriminative feature vectors and screens prediction-basis samples out of the training set. The similarity-weight and the grad-weight are then combined into the cross-weight, which highlights similar regions and aids classification decisions. Cross-CAM is evaluated on the ILSVRC-15 dataset. A new weakly-supervised localization evaluation metric, IoS (Intersection over Self), is proposed to effectively evaluate the focusing effect. Using Cross-CAM highlight regions, the top-1 localization error for weakly-supervised localization reaches 44.95% on the ILSVRC-15 validation set, 16.25% lower than Grad-CAM. The visualization results show that, compared with Grad-CAM, Cross-CAM focuses on key regions by exploiting the similarity between the test image and the prediction-basis samples. (Illustrative sketches of the cross-weight and the IoS metric are given after this record.) |
|---|---|
| Bibliography: | Supported by the National Natural Science Foundation of China (32071775) and the 2020 Industrial Internet Innovation and Development Project - Malicious Code Analysis Equipment Project of Security and Controlled System, No. TC200H02X. First Author and Second Author contributed equally to this work. |
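The abstract above names a grad-weight, a similarity-weight, and their combination into a cross-weight, but gives no formulas. The sketch below is a minimal, hypothetical illustration of that idea: a Grad-CAM-style channel weight from pooled gradients, an assumed per-channel cosine similarity between the test image's feature maps and those of one prediction-basis training sample, and a simple product as the combination rule. The combination rule, all function and variable names, and the single-sample setup are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of the Cross-CAM weighting idea described in the abstract.
# The per-channel similarity-weight, the product combination, and all names
# here are assumptions for illustration only.
import numpy as np

def grad_weights(grads):
    """Grad-CAM-style channel weights: global average pooling of the
    gradients of the class score w.r.t. each feature map (C x H x W)."""
    return grads.mean(axis=(1, 2))                      # shape (C,)

def similarity_weights(test_feats, basis_feats):
    """Assumed per-channel cosine similarity between the test image's
    feature maps and those of a prediction-basis training sample."""
    t = test_feats.reshape(test_feats.shape[0], -1)     # (C, H*W)
    b = basis_feats.reshape(basis_feats.shape[0], -1)
    num = (t * b).sum(axis=1)
    den = np.linalg.norm(t, axis=1) * np.linalg.norm(b, axis=1) + 1e-8
    return num / den                                    # shape (C,)

def cross_cam(feats, grads, basis_feats):
    """Cross-weight = grad-weight combined with similarity-weight (a simple
    product, as an assumption), then the usual CAM weighted sum + ReLU."""
    w = grad_weights(grads) * similarity_weights(feats, basis_feats)
    cam = np.maximum((w[:, None, None] * feats).sum(axis=0), 0.0)
    return cam / (cam.max() + 1e-8)                     # normalized (H, W) heatmap

# Toy usage with random arrays standing in for a conv layer's activations.
C, H, W = 8, 7, 7
rng = np.random.default_rng(0)
heatmap = cross_cam(rng.standard_normal((C, H, W)),
                    rng.standard_normal((C, H, W)),
                    rng.standard_normal((C, H, W)))
print(heatmap.shape)  # (7, 7)
```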
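Likewise, the abstract names the IoS (Intersection over Self) metric without defining it. The sketch below assumes, consistently with the name, the intersection area divided by the area of the predicted region itself; this is an illustrative guess, not the paper's definition.

```python
# Minimal sketch of an IoS (Intersection over Self) metric for bounding
# boxes. The formula is assumed from the metric's name, for illustration.
def ios(pred_box, gt_box):
    """Boxes as (x1, y1, x2, y2). Returns |pred ∩ gt| / |pred|."""
    ix1, iy1 = max(pred_box[0], gt_box[0]), max(pred_box[1], gt_box[1])
    ix2, iy2 = min(pred_box[2], gt_box[2]), min(pred_box[3], gt_box[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    pred_area = (pred_box[2] - pred_box[0]) * (pred_box[3] - pred_box[1])
    return inter / pred_area if pred_area > 0 else 0.0

# Under this assumed definition, a tightly focused highlight fully inside
# the ground truth scores IoS = 1.0, whereas IoU would penalize it for not
# covering the whole object, which matches the "focusing effect" framing.
print(ios((2, 2, 4, 4), (0, 0, 10, 10)))  # 1.0
```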