timm /

Image Classification · timm · PyTorch · Safetensors
rwightman committed
Commit 1fc10a2
1 Parent(s): 212100a

Update model config and README

Files changed (1)
  1. README.md +26 -4
README.md CHANGED
@@ -11,9 +11,9 @@ datasets:

A ResNet image classification model. Trained on ImageNet-1k by Ross Wightman.

- Trained with `timm` scripts using hyper-parameters inspired by the MobileNet-V4 paper with `timm` enhancements.
+ Trained with `timm` scripts using hyper-parameters inspired by the MobileNet-V4 small, mixed with go-to hparams from `timm` and "ResNet Strikes Back".

-
+ A collection of hparam (timm .yaml config files) for this training series can be found here: https://gist.github.com/rwightman/f6705cb65c03daeebca8aa129b1b94ad

## Model Details
- **Model Type:** Image classification / feature backbone

@@ -25,6 +25,7 @@ Trained with `timm` scripts using hyper-parameters inspired by the MobileNet-V4
- **Dataset:** ImageNet-1k
- **Papers:**
  - PyTorch Image Models: https://github.com/huggingface/pytorch-image-models
+  - ResNet strikes back: An improved training procedure in timm: https://arxiv.org/abs/2110.00476
  - Deep Residual Learning for Image Recognition: https://arxiv.org/abs/1512.03385
  - MobileNetV4 -- Universal Models for the Mobile Ecosystem: https://arxiv.org/abs/2404.10518

@@ -157,22 +158,36 @@ output = model.forward_head(output, pre_logits=True)
| [mobilenet_edgetpu_v2_m.ra4_e3600_r224_in1k](http://hf.co/timm/mobilenet_edgetpu_v2_m.ra4_e3600_r224_in1k) | 80.130 | 95.002 | 8.46 | 224 |
| [mobilenetv4_conv_medium.e500_r256_in1k](http://hf.co/timm/mobilenetv4_conv_medium.e500_r256_in1k) | 79.928 | 95.184 | 9.72 | 256 |
| [mobilenetv4_conv_medium.e500_r224_in1k](http://hf.co/timm/mobilenetv4_conv_medium.e500_r224_in1k) | 79.808 | 95.186 | 9.72 | 256 |
+ | [resnetv2_34d.ra4_e3600_r224_in1k](http://hf.co/timm/resnetv2_34d.ra4_e3600_r224_in1k) | 79.590 | 94.770 | 21.82 | 288 |
| [mobilenetv4_conv_blur_medium.e500_r224_in1k](http://hf.co/timm/mobilenetv4_conv_blur_medium.e500_r224_in1k) | 79.438 | 94.932 | 9.72 | 224 |
| [efficientnet_b0.ra4_e3600_r224_in1k](http://hf.co/timm/efficientnet_b0.ra4_e3600_r224_in1k) | 79.364 | 94.754 | 5.29 | 256 |
| [mobilenetv4_conv_medium.e500_r224_in1k](http://hf.co/timm/mobilenetv4_conv_medium.e500_r224_in1k) | 79.094 | 94.77 | 9.72 | 224 |
+ | [resnetv2_34.ra4_e3600_r224_in1k](http://hf.co/timm/resnetv2_34.ra4_e3600_r224_in1k) | 79.072 | 94.566 | 21.80 | 288 |
+ | [resnet34.ra4_e3600_r224_in1k](http://hf.co/timm/resnet34.ra4_e3600_r224_in1k) | 78.952 | 94.450 | 21.80 | 288 |
| [efficientnet_b0.ra4_e3600_r224_in1k](http://hf.co/timm/efficientnet_b0.ra4_e3600_r224_in1k) | 78.584 | 94.338 | 5.29 | 224 |
+ | [resnetv2_34d.ra4_e3600_r224_in1k](http://hf.co/timm/resnetv2_34d.ra4_e3600_r224_in1k) | 78.268 | 93.952 | 21.82 | 224 |
+ | [resnetv2_34.ra4_e3600_r224_in1k](http://hf.co/timm/resnetv2_34.ra4_e3600_r224_in1k) | 77.636 | 93.528 | 21.80 | 224 |
| [mobilenetv1_125.ra4_e3600_r224_in1k](http://hf.co/timm/mobilenetv1_125.ra4_e3600_r224_in1k) | 77.600 | 93.804 | 6.27 | 256 |
- | [mobilenetv3_large_100.ra4_e3600_r224_in1k](http://hf.co/timm/mobilenetv3_large_100.ra4_e3600_r224_in1k) | 77.164 | 93.336 | 5.48 | 256 |
+ | [resnet34.ra4_e3600_r224_in1k](http://hf.co/timm/resnet34.ra4_e3600_r224_in1k) | 77.448 | 93.502 | 21.80 | 224 |
+ | [mobilenetv3_large_100.ra4_e3600_r224_in1k](http://hf.co/timm/mobilenetv3_large_100.ra4_e3600_r224_in1k) | 77.164 | 93.336 | 5.48 | 256 |
| [mobilenetv1_125.ra4_e3600_r224_in1k](http://hf.co/timm/mobilenetv1_125.ra4_e3600_r224_in1k) | 76.924 | 93.234 | 6.27 | 224 |
| [mobilenetv1_100h.ra4_e3600_r224_in1k](http://hf.co/timm/mobilenetv1_100h.ra4_e3600_r224_in1k) | 76.596 | 93.272 | 5.28 | 256 |
| [mobilenetv3_large_100.ra4_e3600_r224_in1k](http://hf.co/timm/mobilenetv3_large_100.ra4_e3600_r224_in1k) | 76.310 | 92.846 | 5.48 | 224 |
| [mobilenetv1_100.ra4_e3600_r224_in1k](http://hf.co/timm/mobilenetv1_100.ra4_e3600_r224_in1k) | 76.094 | 93.004 | 4.23 | 256 |
+ | [resnetv2_18d.ra4_e3600_r224_in1k](http://hf.co/timm/resnetv2_18d.ra4_e3600_r224_in1k) | 76.044 | 93.020 | 11.71 | 288 |
+ | [resnet18d.ra4_e3600_r224_in1k](http://hf.co/timm/resnet18d.ra4_e3600_r224_in1k) | 76.024 | 92.780 | 11.71 | 288 |
| [mobilenetv1_100h.ra4_e3600_r224_in1k](http://hf.co/timm/mobilenetv1_100h.ra4_e3600_r224_in1k) | 75.662 | 92.504 | 5.28 | 224 |
| [mobilenetv1_100.ra4_e3600_r224_in1k](http://hf.co/timm/mobilenetv1_100.ra4_e3600_r224_in1k) | 75.382 | 92.312 | 4.23 | 224 |
- | [mobilenetv4_conv_small.e2400_r224_in1k](http://hf.co/timm/mobilenetv4_conv_small.e2400_r224_in1k) | 74.616 | 92.072 | 3.77 | 256 |
+ | [resnetv2_18.ra4_e3600_r224_in1k](http://hf.co/timm/resnetv2_18.ra4_e3600_r224_in1k) | 75.340 | 92.678 | 11.69 | 288 |
+ | [mobilenetv4_conv_small.e2400_r224_in1k](http://hf.co/timm/mobilenetv4_conv_small.e2400_r224_in1k) | 74.616 | 92.072 | 3.77 | 256 |
+ | [resnetv2_18d.ra4_e3600_r224_in1k](http://hf.co/timm/resnetv2_18d.ra4_e3600_r224_in1k) | 74.412 | 91.936 | 11.71 | 224 |
+ | [resnet18d.ra4_e3600_r224_in1k](http://hf.co/timm/resnet18d.ra4_e3600_r224_in1k) | 74.322 | 91.832 | 11.71 | 224 |
| [mobilenetv4_conv_small.e1200_r224_in1k](http://hf.co/timm/mobilenetv4_conv_small.e1200_r224_in1k) | 74.292 | 92.116 | 3.77 | 256 |
| [mobilenetv4_conv_small.e2400_r224_in1k](http://hf.co/timm/mobilenetv4_conv_small.e2400_r224_in1k) | 73.756 | 91.422 | 3.77 | 224 |
+ | [resnetv2_18.ra4_e3600_r224_in1k](http://hf.co/timm/resnetv2_18.ra4_e3600_r224_in1k) | 73.578 | 91.352 | 11.69 | 224 |
| [mobilenetv4_conv_small.e1200_r224_in1k](http://hf.co/timm/mobilenetv4_conv_small.e1200_r224_in1k) | 73.454 | 91.34 | 3.77 | 224 |
+ | [mobilenetv4_conv_small_050.e3000_r224_in1k](http://hf.co/timm/mobilenetv4_conv_small_050.e3000_r224_in1k) | 65.810 | 86.424 | 2.24 | 256 |
+ | [mobilenetv4_conv_small_050.e3000_r224_in1k](http://hf.co/timm/mobilenetv4_conv_small_050.e3000_r224_in1k) | 64.762 | 85.514 | 2.24 | 224 |

## Citation
```bibtex

@@ -187,6 +202,13 @@ output = model.forward_head(output, pre_logits=True)
}
```
```bibtex
+ @inproceedings{wightman2021resnet,
+   title={ResNet strikes back: An improved training procedure in timm},
+   author={Wightman, Ross and Touvron, Hugo and Jegou, Herve},
+   booktitle={NeurIPS 2021 Workshop on ImageNet: Past, Present, and Future}
+ }
+ ```
+ ```bibtex
@article{He2015,
 author = {Kaiming He and Xiangyu Zhang and Shaoqing Ren and Jian Sun},
 title = {Deep Residual Learning for Image Recognition},
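
For context on using the checkpoints added in this commit: the results table above pairs each set of weights with two test resolutions (224 and the larger 256/288 size), and since ResNets are fully convolutional the same weights accept either input size. A minimal inference sketch, assuming `resnet34.ra4_e3600_r224_in1k` (one of the checkpoints named in the updated table; any of the other listed weights should drop in the same way):

```python
import torch
import timm

# Checkpoint name taken from the results table in this commit.
model = timm.create_model('resnet34.ra4_e3600_r224_in1k', pretrained=True)
model = model.eval()

# Same weights evaluated at both resolutions reported in the table.
for size in (224, 288):
    x = torch.randn(1, 3, size, size)  # stand-in for a preprocessed image batch
    with torch.inference_mode():
        logits = model(x)  # (1, 1000) ImageNet-1k class logits
    top5_prob, top5_idx = torch.topk(logits.softmax(dim=-1), k=5)
    print(size, top5_idx.tolist())
```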
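The hunk headers above quote `output = model.forward_head(output, pre_logits=True)` from the README's feature-backbone example; a matching sketch with the same assumed checkpoint:

```python
import torch
import timm

model = timm.create_model('resnet34.ra4_e3600_r224_in1k', pretrained=True)
model = model.eval()

x = torch.randn(1, 3, 224, 224)
with torch.inference_mode():
    # Unpooled backbone feature map: (1, 512, 7, 7) for resnet34 at 224x224.
    features = model.forward_features(x)
    # Pooled pre-logit embedding, as in the quoted README snippet: (1, 512).
    embedding = model.forward_head(features, pre_logits=True)
print(features.shape, embedding.shape)
```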