rwightman (HF staff) committed
Commit 1774962
1 Parent(s): fd18c70
Files changed (4)
  1. README.md +105 -0
  2. config.json +33 -0
  3. model.safetensors +3 -0
  4. pytorch_model.bin +3 -0
README.md ADDED
@@ -0,0 +1,105 @@
+ ---
+ tags:
+ - image-classification
+ - timm
+ library_tag: timm
+ license: apache-2.0
+ datasets:
+ - imagenet-1k
+ - imagenet-22k
+ ---
+ # Model card for deit3_base_patch16_384.fb_in22k_ft_in1k
+
+ A DeiT-III image classification model. Pretrained on ImageNet-22k and fine-tuned on ImageNet-1k by the paper authors.
+
+ ## Model Details
+ - **Model Type:** Image classification / feature backbone
+ - **Model Stats:**
+   - Params (M): 86.9
+   - GMACs: 55.5
+   - Activations (M): 101.6
+   - Image size: 384 x 384
+ - **Papers:**
+   - DeiT III: Revenge of the ViT: https://arxiv.org/abs/2204.07118
+ - **Original:** https://github.com/facebookresearch/deit
+ - **Dataset:** ImageNet-1k
+ - **Pretrain Dataset:** ImageNet-22k
+
+ ## Model Usage
+ ### Image Classification
+ ```python
+ from urllib.request import urlopen
+ from PIL import Image
+ import timm
+ import torch  # needed for torch.topk below
+
+ img = Image.open(urlopen(
+     'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
+ ))
+
+ model = timm.create_model('deit3_base_patch16_384.fb_in22k_ft_in1k', pretrained=True)
+ model = model.eval()
+
+ # get model specific transforms (normalization, resize)
+ data_config = timm.data.resolve_model_data_config(model)
+ transforms = timm.data.create_transform(**data_config, is_training=False)
+
+ output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1
+
+ top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
+ ```
+
+ ### Image Embeddings
+ ```python
+ from urllib.request import urlopen
+ from PIL import Image
+ import timm
+
+ img = Image.open(urlopen(
+     'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
+ ))
+
+ model = timm.create_model(
+     'deit3_base_patch16_384.fb_in22k_ft_in1k',
+     pretrained=True,
+     num_classes=0,  # remove classifier nn.Linear
+ )
+ model = model.eval()
+
+ # get model specific transforms (normalization, resize)
+ data_config = timm.data.resolve_model_data_config(model)
+ transforms = timm.data.create_transform(**data_config, is_training=False)
+
+ output = model(transforms(img).unsqueeze(0))  # output is (batch_size, num_features) shaped tensor
+
+ # or equivalently (without needing to set num_classes=0)
+
+ output = model.forward_features(transforms(img).unsqueeze(0))
+ # output is unpooled, a (1, 577, 768) shaped tensor
+
+ output = model.forward_head(output, pre_logits=True)
+ # output is a (1, num_features) shaped tensor
+ ```
+
+ ## Model Comparison
+ Explore the dataset and runtime metrics of this model in timm [model results](https://github.com/huggingface/pytorch-image-models/tree/main/results).
+
+ ## Citation
+ ```bibtex
+ @article{Touvron2022DeiTIR,
+   title={DeiT III: Revenge of the ViT},
+   author={Hugo Touvron and Matthieu Cord and Herve Jegou},
+   journal={arXiv preprint arXiv:2204.07118},
+   year={2022},
+ }
+ ```
+ ```bibtex
+ @misc{rw2019timm,
+   author = {Ross Wightman},
+   title = {PyTorch Image Models},
+   year = {2019},
+   publisher = {GitHub},
+   journal = {GitHub repository},
+   doi = {10.5281/zenodo.4414861},
+   howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
+ }
+ ```
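
Reviewer note: the `(1, 577, 768)` shape quoted in the embeddings example follows from the patch arithmetic. The sketch below reproduces it, assuming a square 384 x 384 input, 16 x 16 patches (as the model name suggests), and a single prepended class token; `vit_token_count` is a hypothetical helper written for illustration, not part of timm.

```python
def vit_token_count(image_size: int, patch_size: int, num_prefix_tokens: int = 1) -> int:
    """Count the tokens a ViT-style model produces for a square image.

    Hypothetical helper for illustration; assumes non-overlapping patches
    and `num_prefix_tokens` extra tokens (one class token by default).
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size  # 384 // 16 = 24
    return patches_per_side * patches_per_side + num_prefix_tokens


tokens = vit_token_count(image_size=384, patch_size=16)
embed_dim = 768  # "num_features" in config.json
print((1, tokens, embed_dim))
```

With these assumptions the count is 24 * 24 + 1 = 577, matching the unpooled output shape stated in the README.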
config.json ADDED
@@ -0,0 +1,33 @@
+ {
+   "architecture": "deit3_base_patch16_384",
+   "num_classes": 1000,
+   "num_features": 768,
+   "global_pool": "token",
+   "pretrained_cfg": {
+     "tag": "fb_in22k_ft_in1k",
+     "custom_load": false,
+     "input_size": [
+       3,
+       384,
+       384
+     ],
+     "fixed_input_size": true,
+     "interpolation": "bicubic",
+     "crop_pct": 1.0,
+     "crop_mode": "center",
+     "mean": [
+       0.485,
+       0.456,
+       0.406
+     ],
+     "std": [
+       0.229,
+       0.224,
+       0.225
+     ],
+     "num_classes": 1000,
+     "pool_size": null,
+     "first_conv": "patch_embed.proj",
+     "classifier": "head"
+   }
+ }
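
Reviewer note: the `mean` and `std` entries in `pretrained_cfg` describe per-channel input normalization. The sketch below shows how those values would apply to a single 8-bit RGB pixel, assuming the usual convention of scaling to [0, 1] before standardizing. `normalize_pixel` is a hypothetical helper, and the JSON string is a trimmed copy of the config above rather than the full file.

```python
import json

# Trimmed copy of config.json, keeping only the fields used here.
CONFIG_JSON = """
{
  "pretrained_cfg": {
    "input_size": [3, 384, 384],
    "mean": [0.485, 0.456, 0.406],
    "std": [0.229, 0.224, 0.225]
  }
}
"""


def normalize_pixel(rgb, mean, std):
    """Scale 8-bit RGB values to [0, 1], then standardize each channel.

    Hypothetical helper; assumes the common (x / 255 - mean) / std convention.
    """
    return [(value / 255.0 - m) / s for value, m, s in zip(rgb, mean, std)]


cfg = json.loads(CONFIG_JSON)["pretrained_cfg"]
channels, height, width = cfg["input_size"]
normalized = normalize_pixel((255, 255, 255), cfg["mean"], cfg["std"])
print(channels, height, width, normalized)
```

In practice `timm.data.create_transform`, as used in the README, builds the full preprocessing pipeline from these fields, so a manual version like this is only useful for checking values by hand.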
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:444b5280ba1c6f844474b5ed48fc4d1868571eeccb71bbde40d3932010155a48
+ size 347524826
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a27441215d04a6abfd77af5371e3a60b2948a4eda29238509dd373b79ea71995
+ size 347572173
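
Reviewer note: both weight files are stored as Git LFS pointers, which record the SHA-256 digest and byte size of the real file. The sketch below parses the `model.safetensors` pointer from this commit and shows one way a downloaded file could be checked against it. `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, and the path passed to `matches_pointer` is up to the caller.

```python
import hashlib

# Pointer contents for model.safetensors, copied from the diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:444b5280ba1c6f844474b5ed48fc4d1868571eeccb71bbde40d3932010155a48
size 347524826
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a dict (hypothetical helper)."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest the pointer records."""
    hasher = hashlib.new(pointer["algo"])
    total = 0
    with open(path, "rb") as f:
        # Read in chunks so large weight files are never fully loaded into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("model.safetensors", pointer)` on a local download would then confirm that the file matches what this commit recorded.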