Conditional Mutual Information Constrained Deep Learning for Classification

Bibliographic Details
Published in: IEEE Transactions on Neural Networks and Learning Systems, Vol. 36, no. 8, pp. 15436-15448
Main Authors: Yang, En-Hui; Mohajer Hamidi, Shayan; Ye, Linfeng; Tan, Renhao; Yang, Beverly
Format: Journal Article
Language: English
Published: IEEE, United States, 01.08.2025
ISSN: 2162-237X, 2162-2388
DOI: 10.1109/TNNLS.2025.3540014

Summary: The concepts of conditional mutual information (CMI) and normalized CMI (NCMI) are introduced to measure the concentration and separation performance of a classification deep neural network (DNN) in the output probability distribution space of the DNN, where CMI and the ratio between CMI and NCMI represent the intraclass concentration and interclass separation of the DNN, respectively. By using NCMI to evaluate popular DNNs pretrained over CIFAR-100 and ImageNet in the literature, it is shown that their validation accuracies are more or less inversely proportional to their NCMI values. Based on this observation, the standard deep learning (DL) framework is further modified to minimize the standard cross entropy (CE) function subject to an NCMI constraint, yielding CMI constrained DL (CMIC-DL). A novel alternating learning algorithm is proposed to solve such a constrained optimization problem. Extensive experimental results show that DNNs trained within CMIC-DL outperform the state-of-the-art models trained within the standard DL and other loss functions in the literature in terms of both accuracy and robustness against adversarial attacks. In addition, visualizing the evolution of the learning process through the lens of CMI and NCMI is also advocated.
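To make the abstract's concentration/separation idea concrete, the following is a minimal, illustrative sketch of how one might estimate a CMI-like intraclass concentration and an NCMI-like ratio from a batch of softmax outputs. The exact definitions of CMI, the separation term, and NCMI are those given in the paper; the specific formulas below (KL divergence to the class-conditional mean distribution for concentration, an average cross-class divergence for separation) are assumptions chosen only to convey the spirit of the quantities.

```python
import numpy as np

def cmi_ncmi_sketch(probs, labels, eps=1e-12):
    """Illustrative estimate of a CMI-like concentration measure and an
    NCMI-like ratio over a batch of softmax outputs.

    probs  : (N, C) array of output probability distributions (rows sum to 1)
    labels : (N,) array of class labels
    Returns (concentration, concentration / separation).
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)

    # Class-conditional mean output distribution, an empirical P(yhat | y).
    class_means = {c: probs[labels == c].mean(axis=0) for c in classes}

    def kl(p, q):
        # KL divergence D(p || q) with a small floor for numerical safety.
        return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

    # Intraclass concentration: average divergence of each sample's output
    # distribution from its own class's mean distribution (small = concentrated).
    concentration = float(np.mean(
        [kl(p, class_means[y]) for p, y in zip(probs, labels)]
    ))

    # Interclass separation (assumed form): average divergence between the
    # mean distributions of distinct classes (large = well separated).
    sep_terms = [kl(class_means[a], class_means[b])
                 for a in classes for b in classes if a != b]
    separation = float(np.mean(sep_terms)) if sep_terms else eps

    return concentration, concentration / max(separation, eps)
```

A DNN whose outputs cluster tightly within each class and whose class means are far apart yields a small ratio, matching the abstract's observation that lower NCMI tracks higher validation accuracy; CMIC-DL then constrains this ratio while minimizing cross entropy.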