yttdebaba committed
Commit 0dd4b3f · verified · 1 Parent(s): 6b9cc7b

Update README.md

Files changed (1)
  1. README.md +57 -3
README.md CHANGED
@@ -1,3 +1,57 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ ---
+ ## FDViT: Improve the Hierarchical Architecture of Vision Transformer (ICCV 2023)
+
+ **Yixing Xu, Chao Li, Dong Li, Xiao Sheng, Fan Jiang, Lu Tian, Ashish Sirasao** | [Paper](https://openaccess.thecvf.com/content/ICCV2023/papers/Xu_FDViT_Improve_the_Hierarchical_Architecture_of_Vision_Transformer_ICCV_2023_paper.pdf)
+
+ Advanced Micro Devices, Inc.
+
+ ---
+
+ ## Dependencies
+
+ ```text
+ torch == 1.13.1
+ torchvision == 0.14.1
+ timm == 0.6.12
+ einops == 0.6.1
+ ```
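
Reviewer note: the pinned versions above can be checked against the local environment before loading the model. The following is a minimal sketch using only the standard library; `PINNED` and `check_versions` are hypothetical names introduced here for illustration and are not part of the FDViT repository. It assumes that a local build tag such as `+cu117` should be ignored when comparing versions.

```python
from importlib import metadata

# Versions copied from the Dependencies list in the README.
PINNED = {
    "torch": "1.13.1",
    "torchvision": "0.14.1",
    "timm": "0.6.12",
    "einops": "0.6.1",
}


def check_versions(pinned=PINNED):
    """Return {package: (pinned, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Assumption: drop a local build tag ("+cu117") before comparing.
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in check_versions().items():
        print(f"{name}: pinned {wanted}, found {installed or 'not installed'}")
```

An empty result means every pinned package is installed at the listed version. Newer versions may still work, but the README does not say so.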
+
+ ## Model Performance
+
+ The image classification results of the FDViT models on the ImageNet dataset are shown in the following table.
+
+ |Model|Parameters (M)|FLOPs (G)|Top-1 Accuracy (%)|
+ |-|-|-|-|
+ |FDViT-Ti|4.6|0.6|73.74|
+ |FDViT-S|21.6|2.8|81.45|
+ |FDViT-B|68.1|11.9|82.39|
+
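+ ## Model Usage
+
+ ```python
+ from transformers import AutoModelForImageClassification
+ import torch
+
+ # trust_remote_code=True lets transformers load the custom model code shipped with the checkpoint.
+ model = AutoModelForImageClassification.from_pretrained("FDViT_ti", trust_remote_code=True)
+
+ model.eval()
+
+ # Dummy input: a batch of one 3-channel 224x224 image filled with ones.
+ inp = torch.ones(1, 3, 224, 224)
+ with torch.no_grad():  # inference only, so skip gradient tracking
+     out = model(inp)
+ # print(out.logits)
+ print(torch.sum(out.logits))
+ ```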
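
Reviewer note: the usage snippet only prints the sum of the logits as a smoke test. To turn logits into a prediction, one would typically apply a softmax and take the highest-scoring classes. The sketch below does this in plain Python on a stand-in list of logits so that it runs without downloading a checkpoint. `top_k` is a hypothetical helper, and the assumption that `out.logits` has shape `[batch, num_classes]` is not confirmed by the README.

```python
import math


def top_k(logits, k=5):
    """Return the k most probable (class_index, probability) pairs."""
    # Subtract the max logit for numerical stability before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Stand-in logits for a 4-class problem. With the model from the README,
# one would pass out.logits[0].tolist() instead (assumed shape above).
example = [0.1, 2.0, -1.0, 0.5]
print(top_k(example, k=2))
```

Mapping a class index to a human-readable label needs a label map, which the README does not describe.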
+
+ ## Citation
+
+ ```
+ @inproceedings{xu2023fdvit,
+ title={FDViT: Improve the Hierarchical Architecture of Vision Transformer},
+ author={Xu, Yixing and Li, Chao and Li, Dong and Sheng, Xiao and Jiang, Fan and Tian, Lu and Sirasao, Ashish},
+ booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
+ pages={5950--5960},
+ year={2023}
+ }
+ ```