Knowledge Distillation-Based Domain-Invariant Representation Learning for Domain Generalization

Bibliographic Details
Published in: IEEE Transactions on Multimedia, Vol. 26, pp. 245-255
Main Authors: Niu, Ziwei; Yuan, Junkun; Ma, Xu; Xu, Yingying; Liu, Jing; Chen, Yen-Wei; Tong, Ruofeng; Lin, Lanfen
Format: Journal Article
Language: English; Japanese
Published: Institute of Electrical and Electronics Engineers (IEEE), 2024
ISSN: 1520-9210, 1941-0077
DOI: 10.1109/TMM.2023.3263549

More Information
Summary: Domain generalization (DG) aims to generalize the knowledge learned from multiple source domains to unseen target domains. Existing DG techniques fall into two broad categories, namely domain-invariant representation learning and domain manipulation. Nevertheless, it is extremely difficult to explicitly augment or generate the unseen target data, and as the variety of source domains increases, developing a domain-invariant model by simply aligning more domain-specific information becomes increasingly challenging. In this article, we propose a simple yet effective method for domain generalization, named Knowledge Distillation based Domain-invariant Representation Learning (KDDRL), which learns domain-invariant representations while encouraging the model to maintain domain-specific features, a property that has recently been shown to be effective for domain generalization. To this end, our method incorporates multiple auxiliary student models and one student leader model to perform a two-stage distillation. In the first-stage distillation, each domain-specific auxiliary student treats the ensemble of the other auxiliary students' predictions as its target, which helps to excavate the domain-invariant representation. We also present an error removal module that prevents the transfer of faulty information by eliminating predictions that disagree with the true labels. In the second-stage distillation, the student leader model with domain-specific features combines the domain-invariant representation learned from the group of auxiliary students to make the final prediction. Extensive experiments and in-depth analysis on popular DG benchmark datasets demonstrate that our KDDRL significantly outperforms current state-of-the-art methods.
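
As a rough illustration of the two-stage scheme described in the summary, the following PyTorch-style sketch shows how peer-ensemble distillation targets, an error removal step, and leader distillation could be wired together. All names (aux_logits, leader_logits), the temperature T, the equal loss weighting, and the use of KL-divergence distillation are assumptions made for illustration; they are not taken from the paper's actual implementation.

```python
# Minimal sketch, assuming logit-level distillation with KL divergence.
import torch
import torch.nn.functional as F

def first_stage_loss(aux_logits, labels, T=2.0):
    """Each auxiliary student distills from the ensemble of the *other*
    auxiliary students; ensemble targets that contradict the true label
    are dropped (a stand-in for the error removal module)."""
    losses = []
    for i, logits_i in enumerate(aux_logits):
        peers = [l for j, l in enumerate(aux_logits) if j != i]
        ensemble = torch.stack(peers).mean(dim=0)         # peer-ensemble logits
        keep = ensemble.argmax(dim=1).eq(labels)          # error removal: keep correct targets only
        ce = F.cross_entropy(logits_i, labels)
        if keep.any():
            kd = F.kl_div(
                F.log_softmax(logits_i[keep] / T, dim=1),
                F.softmax(ensemble[keep].detach() / T, dim=1),
                reduction="batchmean",
            ) * T * T
        else:
            kd = logits_i.new_zeros(())
        losses.append(ce + kd)
    return torch.stack(losses).mean()

def second_stage_loss(leader_logits, aux_logits, labels, T=2.0):
    """The student leader is supervised by the labels and distilled from
    the auxiliary students' ensemble."""
    ensemble = torch.stack(aux_logits).mean(dim=0).detach()
    ce = F.cross_entropy(leader_logits, labels)
    kd = F.kl_div(
        F.log_softmax(leader_logits / T, dim=1),
        F.softmax(ensemble / T, dim=1),
        reduction="batchmean",
    ) * T * T
    return ce + kd
```

Per the summary, the leader also combines the domain-invariant representations learned by the auxiliary students rather than only their predictions; the logits-only form above is a simplification of that step.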