Disentangling Subject-Irrelevant Elements in Personalized Text-to-Image Diffusion via Filtered Self-Distillation

Bibliographic Details
Published in Proceedings / IEEE Workshop on Applications of Computer Vision, pp. 9073 - 9082
Main Authors Choi, Seunghwan; Yun, Jooyeol; Park, Jeonghoon; Choo, Jaegul
Format Conference Proceeding
Language English
Published IEEE 26.02.2025
ISSN 2642-9381
DOI 10.1109/WACV61041.2025.00879

More Information
Summary: Recent research has advanced the customization of large-scale text-to-image models. These models bind a unique subject desired by a user to a specific token and use that token to generate the subject in various contexts. However, models from previous studies also bind elements unrelated to the subject's identity, such as backgrounds or poses common to the reference images. This often leads to conflicts between the token and the context of the text prompt during inference, causing the model to fail to generate both the subject and the prompted context. In this work, we approach the issue from a data-scarcity perspective and propose augmenting the set of reference images through a novel self-distillation framework. The framework selects high-quality samples from images generated by a teacher model and uses them to train the student. It can be applied to any model that suffers from such conflicts, and comprehensive evaluations demonstrate that it resolves the issue more effectively than prior approaches.
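
The summary describes the framework only at a high level. As a rough illustration, the Python sketch below shows one plausible shape of such a filtered self-distillation loop; `teacher_generate`, `identity_score`, `train_student_step`, the score threshold, and all hyperparameters are hypothetical stand-ins, since the paper's actual filtering criterion and training procedure are not given here.

```python
# A minimal, hypothetical sketch of a filtered self-distillation loop:
# a teacher model generates candidate images, a quality filter keeps
# only samples that preserve the subject's identity, and the surviving
# samples augment student training. All callables, the threshold, and
# the hyperparameters are illustrative assumptions, not the authors'
# published implementation.
from typing import Any, Callable, List


def filtered_self_distillation(
    teacher_generate: Callable[[str], Any],      # prompt -> generated image
    identity_score: Callable[[Any], float],      # e.g., similarity to reference images
    train_student_step: Callable[[Any, str], None],
    prompts: List[str],                          # diverse contexts for the subject token
    samples_per_prompt: int = 4,
    threshold: float = 0.7,                      # keep only high-fidelity samples
) -> int:
    """Augment scarce reference data with filtered teacher generations."""
    kept = 0
    for prompt in prompts:
        for _ in range(samples_per_prompt):
            image = teacher_generate(prompt)
            # Filter: discard generations whose subject identity drifted.
            if identity_score(image) >= threshold:
                train_student_step(image, prompt)
                kept += 1
    return kept
```

Passing the generator, filter, and training step in as callables keeps the sketch agnostic to the underlying diffusion model and to the quality criterion, which the summary does not specify.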