Conditional Mutual Information Constrained Deep Learning for Classification
| Published in | IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 8, pp. 15436-15448 |
|---|---|
| Main Authors | , , , , |
| Format | Journal Article |
| Language | English |
| Published | United States: IEEE, 01.08.2025 |
| Subjects | |
| ISSN | 2162-237X (print), 2162-2388 (electronic) |
| DOI | 10.1109/TNNLS.2025.3540014 |
| Summary: | The concepts of conditional mutual information (CMI) and normalized CMI (NCMI) are introduced to measure the concentration and separation performance of a classification deep neural network (DNN) in the output probability distribution space of the DNN, where CMI and the ratio between CMI and NCMI represent the intraclass concentration and interclass separation of the DNN, respectively. By using NCMI to evaluate popular DNNs pretrained over CIFAR-100 and ImageNet in the literature, it is shown that their validation accuracies are more or less inversely proportional to their NCMI values. Based on this observation, the standard deep learning (DL) framework is further modified to minimize the standard cross entropy (CE) function subject to an NCMI constraint, yielding CMI constrained DL (CMIC-DL). A novel alternating learning algorithm is proposed to solve such a constrained optimization problem. Extensive experimental results show that DNNs trained within CMIC-DL outperform the state-of-the-art models trained within the standard DL and other loss functions in the literature in terms of both accuracy and robustness against adversarial attacks. In addition, visualizing the evolution of the learning process through the lens of CMI and NCMI is also advocated. |
|---|---|
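The summary describes NCMI as a ratio that couples intraclass concentration with interclass separation in the DNN's output probability space. The record does not include the paper's formal definitions, but a minimal empirical sketch of the idea can be written by treating concentration as the average KL divergence of each sample's output distribution from its class centroid, and separation as the average divergence between distinct class centroids. The function names and these exact definitions are illustrative assumptions, not the authors' formulas:

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence between two discrete distributions, with clipping for stability."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def cmi_ncmi(probs, labels, num_classes):
    """Sketch of CMI-style concentration and an NCMI-style ratio.

    probs:  (N, C) array of softmax output distributions
    labels: (N,) array of ground-truth class indices

    NOTE: plausible reading of the abstract, not the paper's exact definitions.
    """
    # Class-conditional mean output distributions (one centroid per class).
    means = np.stack([probs[labels == c].mean(axis=0) for c in range(num_classes)])
    # Concentration: average divergence of each sample from its own class centroid.
    cmi = float(np.mean([kl(p, means[y]) for p, y in zip(probs, labels)]))
    # Separation: average divergence between distinct class centroids.
    sep = float(np.mean([kl(means[i], means[j])
                         for i in range(num_classes)
                         for j in range(num_classes) if i != j]))
    return cmi, cmi / sep  # smaller ratio = tighter clusters relative to separation

# A penalized surrogate for the constrained objective could then be
#     loss = cross_entropy + lam * (cmi / sep)
# though the paper instead solves the constraint via an alternating algorithm.
```

On well-separated, tightly clustered outputs the ratio is small, which is consistent with the record's observation that validation accuracy tends to vary inversely with NCMI.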