Remove unnecessary " ' " in paper link
README.md
CHANGED
@@ -12,7 +12,7 @@ The Vision Transformer (ViT) is a model for image recognition that employs a Tra
 
 It uses a standard Transformer encoder, as used in NLP, and this simple yet scalable strategy works surprisingly well when coupled with pre-training on large amounts of data and transferred to multiple mid-sized or small image recognition benchmarks, while requiring substantially fewer computational resources to train.
 
-Paper link : [AN IMAGE IS WORTH 16X16 WORDS:TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE
+Paper link : [AN IMAGE IS WORTH 16X16 WORDS:TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE](https://arxiv.org/pdf/2010.11929.pdf)
 
 ## Usage
 