This model contains just the `IPUConfig` files for running the ViT base model.

**This model contains no model weights, only an IPUConfig.**

## Model description

The Vision Transformer (ViT) is an image recognition model that applies a Transformer architecture, of the kind widely used for NLP pretraining, over patches of the image.

It uses a standard Transformer encoder as used in NLP. This simple yet scalable strategy works surprisingly well when coupled with pre-training on large amounts of data and transferred to multiple image recognition benchmarks, while requiring substantially fewer computational resources to train.
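The patch mechanism described above can be sketched in a few lines; this is an illustrative NumPy example (not part of the model card) showing how an image is split into 16x16 patches, each flattened into one token vector:

```python
import numpy as np

def image_to_patches(image: np.ndarray, patch_size: int = 16) -> np.ndarray:
    """Split an (H, W, C) image into flattened patch_size x patch_size patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    # Rearrange into a grid of patches: (rows, cols, patch_size, patch_size, C).
    patches = image.reshape(
        h // patch_size, patch_size, w // patch_size, patch_size, c
    ).transpose(0, 2, 1, 3, 4)
    # Each patch becomes one "token" of length patch_size * patch_size * C.
    return patches.reshape(-1, patch_size * patch_size * c)

# A 224x224 RGB image yields (224/16)**2 = 196 tokens of dimension 16*16*3 = 768.
tokens = image_to_patches(np.zeros((224, 224, 3)))
print(tokens.shape)  # (196, 768)
```

These token vectors are then linearly projected and fed to the Transformer encoder, exactly as word embeddings would be in NLP.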

Paper link: [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/pdf/2010.11929.pdf)

## Usage

The `IPUConfig` can be loaded through the `optimum-graphcore` library; the repository name below is illustrative, as it is not shown in this model card:

```python
from optimum.graphcore import IPUConfig

# The repository name here is illustrative.
ipu_config = IPUConfig.from_pretrained("Graphcore/vit-base-ipu")
```