wkcn committed
Commit: f221fb4
Parent: b85bc5a

Update README.md

Files changed (1):
  1. README.md +6 -26

README.md CHANGED
@@ -8,7 +8,7 @@ tags:
 ---
 # TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance
 
-:pushpin: This is an official PyTorch implementation of **[ICCV 2023]** - [TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance](https://openaccess.thecvf.com/content/ICCV2023/html/Wu_TinyCLIP_CLIP_Distillation_via_Affinity_Mimicking_and_Weight_Inheritance_ICCV_2023_paper.html)
+**[ICCV 2023]** - [TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance](https://openaccess.thecvf.com/content/ICCV2023/html/Wu_TinyCLIP_CLIP_Distillation_via_Affinity_Mimicking_and_Weight_Inheritance_ICCV_2023_paper.html)
 
 **TinyCLIP** is a novel **cross-modal distillation** method for large-scale language-image pre-trained models. The method introduces two core techniques: **affinity mimicking** and **weight inheritance**. This work unleashes the capacity of small CLIP models, fully leveraging large-scale models as well as pre-training data and striking the best trade-off between speed and accuracy.
 
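Of the two techniques named in that summary, affinity mimicking trains the small student model to reproduce the teacher's image-text similarity (affinity) distributions rather than hard labels, and weight inheritance initializes the student from a selected subset of the teacher's pre-trained weights instead of from scratch. A minimal PyTorch sketch of an affinity-mimicking loss, assuming L2-normalized embeddings; it illustrates the idea and is not the official TinyCLIP code:

```python
# Illustrative sketch only -- not the official TinyCLIP implementation.
import torch.nn.functional as F

def affinity_mimicking_loss(student_img, student_txt,
                            teacher_img, teacher_txt, tau=0.07):
    """All inputs: L2-normalized embeddings of shape (batch, dim)."""
    # Image-to-text affinity matrices for student and teacher.
    s = student_img @ student_txt.t() / tau
    t = teacher_img @ teacher_txt.t() / tau
    # The student mimics the teacher's affinity distributions in both
    # directions: image-to-text (rows) and text-to-image (columns).
    i2t = F.kl_div(F.log_softmax(s, dim=-1), F.softmax(t, dim=-1),
                   reduction="batchmean")
    t2i = F.kl_div(F.log_softmax(s.t(), dim=-1), F.softmax(t.t(), dim=-1),
                   reduction="batchmean")
    return (i2t + t2i) / 2
```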
@@ -19,7 +19,7 @@ tags:
 
 ## Use with Transformers
 
-```python3
+```python
 from PIL import Image
 import requests
 
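The hunks above and below show only the start and end of this snippet, since README lines 26-37 are unchanged and elided by the diff. For orientation, here is a complete minimal example in the same style; the checkpoint id below is an assumption for illustration and is not stated in this commit:

```python
from PIL import Image
import requests
from transformers import CLIPModel, CLIPProcessor

# Assumed TinyCLIP checkpoint id, for illustration only.
model_id = "wkcn/TinyCLIP-ViT-40M-32-Text-19M-LAION400M"
model = CLIPModel.from_pretrained(model_id)
processor = CLIPProcessor.from_pretrained(model_id)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(text=["a photo of a cat", "a photo of a dog"],
                   images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

logits_per_image = outputs.logits_per_image  # image-text similarity scores
probs = logits_per_image.softmax(dim=1)      # softmax over the text labels
```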
@@ -38,21 +38,13 @@ logits_per_image = outputs.logits_per_image # this is the image-text similarity
 probs = logits_per_image.softmax(dim=1) # we can take the softmax to get the label probabilities
 ```
 
-
-
-
-
 ## Highlights
 <p align="center">
 <img src="./figure/fig1.jpg" width="500">
 </p>
 
 * TinyCLIP ViT-45M/32 uses only **half the parameters** of ViT-B/32 to achieve **comparable zero-shot performance**.
-* TinyCLIP ResNet-19M reduces the parameters by **50\%** while getting **$2\times$** inference speedup, and obtains **56.4\%** accuracy on ImageNet.
-
-## News
-* *Oct.2023* Training code is released.
-* *Sep.2023* This is preliminary released code, including inference code and checkpoints.
+* TinyCLIP ResNet-19M reduces the parameters by **50\%** while getting a **2x** inference speedup, and obtains **56.4\%** accuracy on ImageNet.
 
 ## Model Zoo
 | Model | Weight inheritance | Pretrain | IN-1K Acc@1(%) | MACs(G) | Throughput(pairs/s) | Link |
@@ -71,20 +63,8 @@ TinyCLIP ViT-45M/32 Text-18M | auto | LAION+YFCC-400M | 62.7 | 1.9 | 3,685 | [M
 
 Note: The configs of models with auto inheritance are generated automatically.
 
-## Getting Started
-:beginner: Here is the setup tutorial, evaluation and pretraining scripts.
-
-### Install dependencies and prepare dataset
-- [Preparation](./docs/PREPARATION.md)
-
-### Evaluate it
-- [Evaluation](./docs/EVALUATION.md)
-
-### An example for inference
-- [Inference](./inference.py)
-
-### Pretrain it
-- [Pretraining](./docs/PRETRAINING.md)
+## Official PyTorch Implementation
+https://github.com/microsoft/Cream/tree/main/TinyCLIP
 
 ## Citation
 If this repo is helpful for you, please consider citing it. :mega: Thank you! :)
@@ -105,4 +85,4 @@ If this repo is helpful for you, please consider citing it. :mega: Thank you! :
 Our code is based on [CLIP](https://github.com/openai/CLIP), [OpenCLIP](https://github.com/mlfoundations/open_clip), [CoFi](https://github.com/princeton-nlp/CoFiPruning) and [PyTorch](https://github.com/pytorch/pytorch). Thanks to the contributors for their awesome contributions!
 
 ## License
-- [License](./LICENSE)
+- [License](https://github.com/microsoft/Cream/blob/main/TinyCLIP/LICENSE)
 