Commit f221fb4 by wkcn · 1 Parent(s): b85bc5a

Update README.md

Files changed (1)
  1. README.md +6 -26
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
 ---
 # TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance
 
-:pushpin: This is an official PyTorch implementation of **[ICCV 2023]** - [TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance](https://openaccess.thecvf.com/content/ICCV2023/html/Wu_TinyCLIP_CLIP_Distillation_via_Affinity_Mimicking_and_Weight_Inheritance_ICCV_2023_paper.html)
+**[ICCV 2023]** - [TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance](https://openaccess.thecvf.com/content/ICCV2023/html/Wu_TinyCLIP_CLIP_Distillation_via_Affinity_Mimicking_and_Weight_Inheritance_ICCV_2023_paper.html)
 
 **TinyCLIP** is a novel **cross-modal distillation** method for large-scale language-image pre-trained models. The method introduces two core techniques: **affinity mimicking** and **weight inheritance**. This work unleashes the capacity of small CLIP models, fully leveraging large-scale models as well as pre-training data and striking the best trade-off between speed and accuracy.
 
@@ -19,7 +19,7 @@ tags:
 
 ## Use with Transformers
 
-```python3
+```python
 from PIL import Image
 import requests
 
@@ -38,21 +38,13 @@ logits_per_image = outputs.logits_per_image # this is the image-text similarity
 probs = logits_per_image.softmax(dim=1) # we can take the softmax to get the label probabilities
 ```
 
-
-
-
-
 ## Highlights
 <p align="center">
 <img src="./figure/fig1.jpg" width="500">
 </p>
 
 * TinyCLIP ViT-45M/32 uses only **half parameters** of ViT-B/32 to achieves **comparable zero-shot performance**.
-* TinyCLIP ResNet-19M reduces the parameters by **50\%** while getting **$2\times$** inference speedup, and obtains **56.4\%** accuracy on ImageNet.
-
-## News
-* *Oct.2023* Training code is released.
-* *Sep.2023* This is preliminary released code, including inference code and checkpoints.
+* TinyCLIP ResNet-19M reduces the parameters by **50\%** while getting **2x** inference speedup, and obtains **56.4\%** accuracy on ImageNet.
 
 ## Model Zoo
 | Model | Weight inheritance | Pretrain | IN-1K Acc@1(%) | MACs(G) | Throughput(pairs/s) | Link |
@@ -71,20 +63,8 @@ TinyCLIP ViT-45M/32 Text-18M | auto | LAION+YFCC-400M | 62.7 | 1.9 | 3,685 | [M
 
 Note: The configs of models with auto inheritance are generated automatically.
 
-## Getting Started
-:beginner: Here is the setup tutorial, evaluation and pretraining scripts.
-
-### Install dependencies and prepare dataset
-- [Preparation](./docs/PREPARATION.md)
-
-### Evaluate it
-- [Evaluation](./docs/EVALUATION.md)
-
-### An example for inference
-- [Inference](./inference.py)
-
-### Pretrain it
-- [Pretraining](./docs/PRETRAINING.md)
+## Official PyTorch Implementation
+https://github.com/microsoft/Cream/tree/main/TinyCLIP
 
 ## Citation
 If this repo is helpful for you, please consider to cite it. :mega: Thank you! :)
@@ -105,4 +85,4 @@ If this repo is helpful for you, please consider to cite it. :mega: Thank you! :
 Our code is based on [CLIP](https://github.com/openai/CLIP), [OpenCLIP](https://github.com/mlfoundations/open_clip), [CoFi](https://github.com/princeton-nlp/CoFiPruning) and [PyTorch](https://github.com/pytorch/pytorch). Thank contributors for their awesome contribution!
 
 ## License
-- [License](./LICENSE)
+- [License](https://github.com/microsoft/Cream/blob/main/TinyCLIP/LICENSE)
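
Note on the "Use with Transformers" snippet touched by this commit: its last visible line, `probs = logits_per_image.softmax(dim=1)`, turns each image's similarity scores into probabilities over the candidate captions. The sketch below illustrates that one step in plain Python, without PyTorch or a model download. The `softmax` helper and the logit values are hypothetical and chosen only for illustration; they are not taken from the README or from any TinyCLIP checkpoint.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of logits (hypothetical helper)."""
    peak = max(row)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up similarity scores: one image (row) scored against two captions (columns).
logits_per_image = [[24.5, 19.0]]

# Equivalent in spirit to softmax(dim=1): normalise across captions for each image.
probs = [softmax(row) for row in logits_per_image]
print(probs)
```

Each row of `probs` sums to 1, and the caption with the larger logit receives the larger probability, which is why the README reads these values as label probabilities.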