On Mutual Information in Contrastive Learning for Visual Representations
| Format | Journal Article |
|---|---|
| Language | English |
| Published | 27.05.2020 |
| DOI | 10.48550/arxiv.2005.13149 |
| Summary: | In recent years, several unsupervised, "contrastive" learning algorithms in vision have been shown to learn representations that perform remarkably well on transfer tasks. We show that this family of algorithms maximizes a lower bound on the mutual information between two or more "views" of an image, where typical views come from a composition of image augmentations. Our bound generalizes the InfoNCE objective to support negative sampling from a restricted region of "difficult" contrasts. We find that the choice of negative samples and views is critical to the success of these algorithms. Reformulating previous learning objectives in terms of mutual information also simplifies and stabilizes them. In practice, our new objectives yield representations that outperform those learned with previous approaches (IR, LA, and CMC) for transfer to classification, bounding box detection, instance segmentation, and keypoint detection. The mutual information framework provides a unifying comparison of approaches to contrastive learning and uncovers the choices that impact representation learning. |
|---|---|
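The InfoNCE objective mentioned in the summary can be sketched in a few lines. The following is an illustrative NumPy implementation of the standard InfoNCE loss over a batch of paired views, not the authors' code; the function name and the temperature value are assumptions made for this sketch.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.07):
    """InfoNCE loss for a batch of N paired views.

    z1[i] and z2[i] are embeddings of two augmented views of image i
    (the positive pair); every z2[j] with j != i serves as a negative.
    Minimizing this loss maximizes a lower bound on the mutual
    information between views: I(v1; v2) >= log(N) - loss.
    """
    # project embeddings onto the unit sphere so the dot product
    # is cosine similarity
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)

    logits = z1 @ z2.T / temperature             # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability

    # log-softmax over each row; positives sit on the diagonal
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

Restricting the columns of the similarity matrix to a subset of "difficult" contrasts (rather than the full batch) is one way to realize the restricted negative sampling the summary describes.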