Commit 63d2875 by rwightman (HF staff), 1 parent: c6a284e

Update README.md

Files changed (1)
  1. README.md +29 -3
@@ -7,10 +7,36 @@ license: cc-by-nc-4.0
  ---
  # Model card for convnext_large_mlp.laion2b_ft_augreg_inat21

- ## Validation Metrics

- - 90.95 top-1 @ 448x448
- - 90.62 @ 384x384
+ Part of a series of `timm` fine-tune experiments on iNaturalist 2021 competition data (https://github.com/visipedia/inat_comp/tree/master/2021) for higher-capacity models.

+ Covering 10,000 species, this dataset and these models are fun to explore via the classification widget with pictures from your backyard, but they are quite a bit smaller than the models you can find on the iNaturalist website (https://www.inaturalist.org/blog/75633-a-new-computer-vision-model-v2-1-including-1-770-new-taxa).

+ No extra metadata was used for training these models (as was the case for the competition); it was a straightforward fine-tune to explore differences in model pretrain data.

+ | Model | Top-1 | Top-5 | Img Size (Train) | Paper |
+ |-------|-------|-------|----------|-------|
+ | [eva02_large_patch14_clip_336.merged2b_ft_inat21](https://huggingface.co/timm/eva02_large_patch14_clip_336.merged2b_ft_inat21) | 92.05 | 98.01 | 336 | https://arxiv.org/abs/2303.11331 |
+ | [vit_large_patch14_clip_336.datacompxl_ft_augreg_inat21](https://huggingface.co/timm/vit_large_patch14_clip_336.datacompxl_ft_augreg_inat21) | 91.98 | 98.03 | 336 | https://arxiv.org/abs/2304.14108 |
+ | [vit_large_patch14_clip_336.laion2b_ft_augreg_inat21](https://huggingface.co/timm/vit_large_patch14_clip_336.laion2b_ft_augreg_inat21) | 91.48 | 97.89 | 336 | https://arxiv.org/abs/2212.07143 |
+ | [convnext_large_mlp.laion2b_ft_augreg_inat21](https://huggingface.co/timm/convnext_large_mlp.laion2b_ft_augreg_inat21) | 90.95 | 97.68 | 448 (384) | |
+ | [vit_large_patch14_clip_336.datacompxl_ft_inat21](https://huggingface.co/timm/vit_large_patch14_clip_336.datacompxl_ft_inat21) | 90.85 | 97.68 | 336 | https://arxiv.org/abs/2304.14108 |
+ | [convnext_large_mlp.laion2b_ft_augreg_inat21](https://huggingface.co/timm/convnext_large_mlp.laion2b_ft_augreg_inat21) | 90.62 | 97.61 | 384 | |
+ | [vit_large_patch14_clip_336.laion2b_ft_in12k_in1k_inat21](https://huggingface.co/timm/vit_large_patch14_clip_336.laion2b_ft_in12k_in1k_inat21) | 90.29 | 97.44 | 336 | https://arxiv.org/abs/2212.07143 |

+
+ ## Run Validation
+ ```
+ python validate.py /tfds/ --dataset tfds/i_naturalist2021 --model hf-hub:timm/convnext_large_mlp.laion2b_ft_augreg_inat21 --split val --amp
+ ```
+
+ ## Citation
+
+ ```bibtex
+ @inproceedings{cherti2023reproducible,
+ title={Reproducible scaling laws for contrastive language-image learning},
+ author={Cherti, Mehdi and Beaumont, Romain and Wightman, Ross and Wortsman, Mitchell and Ilharco, Gabriel and Gordon, Cade and Schuhmann, Christoph and Schmidt, Ludwig and Jitsev, Jenia},
+ booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
+ pages={2818--2829},
+ year={2023}
+ }
+ ```
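
The Run Validation command in this README change evaluates the whole validation split. For trying the model on a single backyard photo, something like the following may be a useful starting point. It is a minimal sketch and not part of the commit: the `classify` helper and its `k` parameter are hypothetical names, it assumes a recent `timm` release plus `torch` and `Pillow` are installed, and it returns raw class indices because the mapping from index to species name is not covered by this README change.

```python
MODEL_ID = "hf-hub:timm/convnext_large_mlp.laion2b_ft_augreg_inat21"


def classify(image_path, k=5):
    """Return the top-k (class_index, probability) pairs for one image.

    Hypothetical helper for illustration; not part of timm or this repo.
    """
    # Heavy dependencies are imported lazily so this module can be imported
    # even where torch/timm are not installed.
    import timm
    import torch
    from PIL import Image

    # Download the fine-tuned weights from the Hub and switch to eval mode.
    model = timm.create_model(MODEL_ID, pretrained=True).eval()

    # Build the eval-time preprocessing from the model's pretrained config.
    data_config = timm.data.resolve_model_data_config(model)
    transform = timm.data.create_transform(**data_config, is_training=False)

    image = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        logits = model(transform(image).unsqueeze(0))  # (1, num_classes)

    # Softmax over classes, then keep the k most probable entries.
    probs, indices = torch.topk(logits.softmax(dim=1), k=k)
    return list(zip(indices[0].tolist(), probs[0].tolist()))
```

Calling `classify("backyard_bird.jpg")` (a placeholder file name) would download the weights on first use. The preprocessing follows whatever input size the model's default config specifies, so it may not match the 448x448 evaluation row in the table.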