harsha163 committed on
Commit
6db14fd
1 Parent(s): d173a49

added description

Files changed (1): README.md +8 -6
README.md CHANGED
@@ -5,17 +5,17 @@ tags:
 - Architecture
 ---
 
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
+# TensorFlow Keras implementation of: [Learning to tokenize in Vision Transformers](https://keras.io/examples/vision/token_learner/)
+
+Full credit goes to [Aritra Roy Gosthipaty](https://twitter.com/ariG23498) and [Sayak Paul](https://twitter.com/RisingSayak).
+
+## Short description
+
+ViT and other Transformer-based architectures represent images as patches. As the resolution of the images increases, the number of patches increases as well. To tackle this, Ryoo et al. introduced a module called TokenLearner that reduces the number of patches used. The full paper can be found [here](https://openreview.net/forum?id=z-l1kpDXs88).
+
+## Model and dataset used
+
+The dataset used here is CIFAR-10. The model is a mini ViT with the TokenLearner module.
 
 ## Training procedure
 
@@ -36,6 +36,8 @@ The following hyperparameters were used during training:
 | exclude_from_weight_decay | None |
 | training_precision | float32 |
 
+## Training Metrics
+After 20 epochs, the model reaches a test accuracy of 55.9% and a top-5 test accuracy of 95.06%.
 
 ## Model Plot
43