Cross-CAM: Focused Visual Explanations for Deep Convolutional Networks via Training-Set Tracing
| Published in | Knowledge Science, Engineering and Management, Vol. 13368, pp. 735-745 |
|---|---|
| Main Authors | |
| Format | Book Chapter |
| Language | English |
| Published | Switzerland: Springer International Publishing, 2022 |
| Series | Lecture Notes in Computer Science |
| Subjects | |
| ISBN | 9783031109829 3031109821 |
| ISSN | 0302-9743 1611-3349 |
| DOI | 10.1007/978-3-031-10983-6_56 |
| Summary: | In recent years, widely used deep learning technologies have remained controversial in terms of reliability and credibility. Class Activation Maps (CAMs) have been proposed to explain deep learning models. Existing CAM-based algorithms highlight critical portions of the input image, but they go no further in tracing the basis of the neural network's decision. This work proposes Cross-CAM, a visual interpretation method that supports deep traceability to prediction-basis samples and focuses on category-relevant regions shared by the input image and those samples. Cross-CAM extracts deep discriminative feature vectors and screens prediction-basis samples out of the training set. The similarity-weight and the grad-weight are then combined into the cross-weight, which highlights similar regions and aids classification decisions. Cross-CAM is evaluated on the ILSVRC-15 dataset. A new weakly-supervised localization evaluation metric, IoS (Intersection over Self), is proposed to effectively evaluate the focusing effect. Using Cross-CAM highlight regions, the top-1 localization error for weakly-supervised localization reaches 44.95% on the ILSVRC-15 validation set, 16.25% lower than Grad-CAM. The visualization results show that, compared with Grad-CAM, Cross-CAM focuses on key regions by exploiting the similarity between the test image and the prediction-basis samples. (Illustrative sketches of the cross-weight and the IoS metric are given after this record.) |
|---|---|
| Bibliography: | Supported by the National Natural Science Foundation of China (32071775) and the 2020 Industrial Internet Innovation and Development Project - Malicious Code Analysis Equipment Project of Security and Controlled System, No. TC200H02X. First Author and Second Author contributed equally to this work. |
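The abstract above names a grad-weight, a similarity-weight, and their combination into a cross-weight, but gives no formulas. The sketch below is a minimal, hypothetical illustration of that idea: a Grad-CAM-style channel weight from pooled gradients, an assumed per-channel cosine similarity between the test image's feature maps and those of one prediction-basis training sample, and a simple product as the combination rule. The combination rule, all function and variable names, and the single-sample setup are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of the Cross-CAM weighting idea described in the abstract.
# The per-channel similarity-weight, the product combination, and all names
# here are assumptions for illustration only.
import numpy as np

def grad_weights(grads):
    """Grad-CAM-style channel weights: global average pooling of the
    gradients of the class score w.r.t. each feature map (C x H x W)."""
    return grads.mean(axis=(1, 2))                      # shape (C,)

def similarity_weights(test_feats, basis_feats):
    """Assumed per-channel cosine similarity between the test image's
    feature maps and those of a prediction-basis training sample."""
    t = test_feats.reshape(test_feats.shape[0], -1)     # (C, H*W)
    b = basis_feats.reshape(basis_feats.shape[0], -1)
    num = (t * b).sum(axis=1)
    den = np.linalg.norm(t, axis=1) * np.linalg.norm(b, axis=1) + 1e-8
    return num / den                                    # shape (C,)

def cross_cam(feats, grads, basis_feats):
    """Cross-weight = grad-weight combined with similarity-weight (a simple
    product, as an assumption), then the usual CAM weighted sum + ReLU."""
    w = grad_weights(grads) * similarity_weights(feats, basis_feats)
    cam = np.maximum((w[:, None, None] * feats).sum(axis=0), 0.0)
    return cam / (cam.max() + 1e-8)                     # normalized (H, W) heatmap

# Toy usage with random arrays standing in for a conv layer's activations.
C, H, W = 8, 7, 7
rng = np.random.default_rng(0)
heatmap = cross_cam(rng.standard_normal((C, H, W)),
                    rng.standard_normal((C, H, W)),
                    rng.standard_normal((C, H, W)))
print(heatmap.shape)  # (7, 7)
```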
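Likewise, the abstract names the IoS (Intersection over Self) metric without defining it. The sketch below assumes, consistently with the name, the intersection area divided by the area of the predicted region itself; this is an illustrative guess, not the paper's definition.

```python
# Minimal sketch of an IoS (Intersection over Self) metric for bounding
# boxes. The formula is assumed from the metric's name, for illustration.
def ios(pred_box, gt_box):
    """Boxes as (x1, y1, x2, y2). Returns |pred ∩ gt| / |pred|."""
    ix1, iy1 = max(pred_box[0], gt_box[0]), max(pred_box[1], gt_box[1])
    ix2, iy2 = min(pred_box[2], gt_box[2]), min(pred_box[3], gt_box[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    pred_area = (pred_box[2] - pred_box[0]) * (pred_box[3] - pred_box[1])
    return inter / pred_area if pred_area > 0 else 0.0

# Under this assumed definition, a tightly focused highlight fully inside
# the ground truth scores IoS = 1.0, whereas IoU would penalize it for not
# covering the whole object, which matches the "focusing effect" framing.
print(ios((2, 2, 4, 4), (0, 0, 10, 10)))  # 1.0
```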