Unified Contrastive Learning in Image-Text-Label Space
"Unified Contrastive Learning in Image-Text-Label Space. CVPR 2022" by Jianwei Yang*, Chunyuan Li*, Pengchuan Zhang*, Bin Xiao*, Ce Liu, Lu Yuan, and Jianfeng Gao.
Motivation
In this paper, we introduce a new perspective on commonly used image-label and image-text data by placing both in a common image-text-label space. In this space, we propose a new learning paradigm, Unified Contrastive Learning (UniCL), which uses a single learning objective to seamlessly promote the synergy of the two data types. We demonstrate that UniCL is an effective way of learning semantically rich yet discriminative representations for image recognition across zero-shot, linear-probe, full fine-tuning, and transfer learning scenarios. When scaled up to billion-scale data, UniCL alone can learn a powerful visual-semantic representation that supports dozens of downstream tasks, as shown in Florence.
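To make the "single learning objective" more concrete, here is a minimal sketch of a bidirectional, label-aware contrastive loss in the spirit of UniCL. It is not the released training code. It assumes that each sample is an (image, text, label) triplet, that every image-text pair without a class label receives its own unique label, and that image-label data gets its text from the class name. The function name `unified_contrastive_loss`, the temperature value, and the averaging over the batch are illustrative choices, and plain Python lists stand in for encoder outputs.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def unified_contrastive_loss(image_feats, text_feats, labels, temperature=0.07):
    """Sketch of a label-aware bidirectional contrastive loss.

    Samples that share a label are treated as positives for each other,
    in both the image-to-text and text-to-image directions. With all
    labels unique, this reduces to a standard pairwise contrastive loss.
    The temperature and batch averaging are assumptions of this sketch.
    """
    n = len(labels)
    u = [l2_normalize(f) for f in image_feats]
    v = [l2_normalize(f) for f in text_feats]

    # logits[i][j]: temperature-scaled cosine similarity of image i and text j
    logits = [
        [sum(a * b for a, b in zip(u[i], v[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Transpose to get text-to-image logits
    logits_t = [list(col) for col in zip(*logits)]

    def directional(lg):
        total = 0.0
        for i in range(n):
            logp = log_softmax(lg[i])
            # Positives: every sample in the batch with the same label as i
            positives = [k for k in range(n) if labels[k] == labels[i]]
            total -= sum(logp[k] for k in positives) / len(positives)
        return total / n

    return directional(logits) + directional(logits_t)


if __name__ == "__main__":
    # Toy batch: two samples of class 0 and one image-text pair with its own label
    images = [[1.0, 0.1], [0.9, 0.2], [0.0, 1.0]]
    texts = [[1.0, 0.0], [1.0, 0.0], [0.1, 1.0]]
    print(unified_contrastive_loss(images, texts, labels=[0, 0, 2]))
```

In this sketch, the only difference between image-label and image-text data is how the `labels` list is filled in, which is what lets one objective cover both data types.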
Benchmarking
Image-label training augmented by image-text pairs
Model | Training Set | Top-1 on IN-1K | ZS on 14 datasets | Download |
---|---|---|---|---|
Swin-T | IN-1K | 79.9 | 30.2 | ckpt/config |
Swin-T | IN-1K + GCC-3M | 80.2 | 39.0 | ckpt/config |
Swin-T | IN-1K + YFCC-14M | 81.1 | 40.0 | ckpt/config |
Swin-T | IN-1K + GCC-15M | 81.8 | 45.1 | ckpt/config |
Note that all the above models are trained without strong data augmentations like mixup and cutmix.
Image-text learning augmented by image-label data
Model | Training Set | ZS on IN-1K | ZS on 14 datasets | Download |
---|---|---|---|---|
Swin-T | YFCC-14M | 30.1 | 36.3 | ckpt/config |
Swin-T | IN-21K | 28.5 | 37.8 | ckpt/config |
Swin-T | IN-21K (half) + YFCC-14M (half) | 36.4 | 45.5 | ckpt/config |
Swin-T | IN-21K + YFCC-14M | 40.5 | 49.1 | ckpt/config |
Swin-B | YFCC-14M | 37.8 | - | ckpt/config |
Swin-B | IN-21K | 29.9 | 42.4 | ckpt/config |
Swin-B | IN-21K (half) + YFCC-14M (half) | 41.1 | 48.5 | ckpt/config |
Swin-B | IN-21K + YFCC-14M | 44.3 | 52.2 | ckpt/config |
Swin-B | IN-21K + YFCC-14M + GCC-15M | 57.9 | - | ckpt/config |