harsha163 committed on
Commit
6db14fd
1 Parent(s): d173a49

added description

Files changed (1): README.md +8 -6
README.md CHANGED
@@ -5,17 +5,17 @@ tags:
 - Architecture
 ---
 
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
+# TensorFlow Keras implementation of: [Learning to tokenize in Vision Transformers](https://keras.io/examples/vision/token_learner/)
+
+Full credit goes to [Aritra Roy Gosthipaty](https://twitter.com/ariG23498) and [Sayak Paul](https://twitter.com/RisingSayak).
+
+## Short description
+
+ViT and other Transformer-based architectures represent images as patches. As the resolution of the images increases, the number of patches increases as well. To tackle this, Ryoo et al. introduced a module called TokenLearner that reduces the number of patches used. The full paper can be found [here](https://openreview.net/forum?id=z-l1kpDXs88).
+
+## Model and dataset used
+
+The dataset used here is CIFAR-10. The model is a mini ViT with the TokenLearner module.
 
 ## Training procedure
 
@@ -36,6 +36,8 @@ The following hyperparameters were used during training:
 | exclude_from_weight_decay | None |
 | training_precision | float32 |
 
+## Training Metrics
+After 20 epochs, the model reaches a test accuracy of 55.9% and a top-5 test accuracy of 95.06%.
 
 ## Model Plot
43