mobileclip
fartashf committed · verified
Commit a3468db · 1 Parent(s): 61176e3

Upload folder using huggingface_hub

Files changed (1): README.md (+6 -2)
README.md CHANGED
@@ -7,6 +7,7 @@ library_name: mobileclip
 
 MobileCLIP2 was introduced in [MobileCLIP2: Improving Multi-Modal Reinforced Training](http://arxiv.org/abs/2508.20691) (TMLR August 2025 <mark>Featured</mark>), by Fartash Faghri, Pavan Kumar Anasosalu Vasu, Cem Koc, Vaishaal Shankar, Alexander T Toshev, Oncel Tuzel, Hadi Pouransari.
 
+
 This repository contains the **MobileCLIP-S4** checkpoint.
 
 ![MobileCLIP2 Performance Figure](fig_accuracy_latency_v2.png)
@@ -24,7 +25,7 @@ This repository contains the **MobileCLIP-S4** checkpoint.
 
 | Model | # Seen <BR>Samples (B) | # Params (M) <BR> (img + txt) | Latency (ms) <BR> (img + txt) | IN-1k Zero-Shot <BR> Top-1 Acc. (%) | Avg. Perf. (%) <BR> on 38 datasets |
 |:----------------------------------------------------------|:----------------------:|:-----------------------------:|:-----------------------------:|:-----------------------------------:|:----------------------------------:|
-| [MobileCLIP2-S0](https://hf.co/apple/MobileCLIP2-S0) | 13 | 11.4 + 42.4 | 1.5 + 1.6 | 71.5 | 59.7 |
+| [MobileCLIP2-S0](https://hf.co/apple/MobileCLIP2-S0) | 13 | 11.4 + 63.4 | 1.5 + 3.3 | 71.5 | 59.7 |
 | [MobileCLIP2-S2](https://hf.co/apple/MobileCLIP2-S2) | 13 | 35.7 + 63.4 | 3.6 + 3.3 | 77.2 | 64.1 |
 | [MobileCLIP2-B](https://hf.co/apple/MobileCLIP2-B) | 13 | 86.3 + 63.4 | 10.4 + 3.3 | 79.4 | 65.8 |
 | [MobileCLIP2-S3](https://hf.co/apple/MobileCLIP2-S3) | 13 | 125.1 + 123.6 | 8.0 + 6.6 | 80.7 | 66.8 |
@@ -61,8 +62,11 @@ from mobileclip.modules.common.mobileone import reparameterize_model
 model, _, preprocess = open_clip.create_model_and_transforms('MobileCLIP2-S4', pretrained='/path/to/mobileclip_s4.pt')
 tokenizer = open_clip.get_tokenizer('MobileCLIP2-S4')
 
+# Model needs to be in eval mode for inference because of batchnorm layers unlike ViTs
+model.eval()
+
 # For inference/model exporting purposes, please reparameterize first
-model = reparameterize_model(model.eval())
+model = reparameterize_model(model)
 
 image = preprocess(Image.open("docs/fig_accuracy_latency.png").convert('RGB')).unsqueeze(0)
 text = tokenizer(["a diagram", "a dog", "a cat"])
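A note on why the last hunk moves `model.eval()` ahead of reparameterization: `reparameterize_model` folds the train-time branches of MobileOne-style blocks (convolutions plus BatchNorm) into single inference-time convolutions, and BatchNorm can only be folded against the fixed running statistics it uses in eval mode; ViT-style towers have no such layers, hence the new comment. A generic sketch of the folding arithmetic, as an illustration of the idea rather than MobileCLIP's actual implementation:

```python
import torch
import torch.nn as nn

def fold_bn_into_conv(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Return one Conv2d equivalent to bn(conv(x)), with bn in eval mode."""
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      stride=conv.stride, padding=conv.padding, bias=True)
    # Eval-mode BN computes y = gamma * (x - mean) / sqrt(var + eps) + beta
    # using the *running* mean/var tracked during training -- this is what
    # model.eval() switches on, and what makes the folding well-defined.
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.data = conv.weight * scale.reshape(-1, 1, 1, 1)
    conv_bias = conv.bias if conv.bias is not None else torch.zeros(conv.out_channels)
    fused.bias.data = bn.bias + (conv_bias - bn.running_mean) * scale
    return fused

# The folded conv matches the original pair only in eval mode:
conv, bn = nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8)
bn.eval()
x = torch.randn(1, 3, 32, 32)
assert torch.allclose(bn(conv(x)), fold_bn_into_conv(conv, bn)(x), atol=1e-5)
```

Folding a BN layer that is still in training mode would bake running statistics into a network whose live behavior depends on per-batch statistics, so the fused model would not reproduce the un-fused one; calling `eval()` first removes that mismatch.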
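For completeness, here is how the updated README snippet slots into a full zero-shot prediction. Everything past the tokenizer call is the standard `open_clip` encode-normalize-softmax pattern rather than anything introduced by this commit, and the checkpoint and image paths are the README's placeholders:

```python
import torch
from PIL import Image
import open_clip
from mobileclip.modules.common.mobileone import reparameterize_model

model, _, preprocess = open_clip.create_model_and_transforms(
    'MobileCLIP2-S4', pretrained='/path/to/mobileclip_s4.pt')
tokenizer = open_clip.get_tokenizer('MobileCLIP2-S4')

# Eval first (fixed BatchNorm statistics), then fold branches for inference.
model.eval()
model = reparameterize_model(model)

image = preprocess(Image.open("docs/fig_accuracy_latency.png").convert('RGB')).unsqueeze(0)
text = tokenizer(["a diagram", "a dog", "a cat"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Cosine-similarity logits -> probabilities over the candidate captions
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("Label probs:", text_probs)
```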