cvetanovskaa/vit-base-patch16-224-in21k-gtsrb-tuned


cvetanovskaa committed on Nov 27, 2023

Commit b82f455 • 1 Parent(s): b61042f

Update README.md

Files changed (1)

README.md +1 -1


@@ -12,7 +12,7 @@ pipeline_tag: image-classification
 Vision Transformer (ViT) model pre-trained on ImageNet-21k (14 million images, 21,843 classes) at resolution 224x224. It was introduced in the paper [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/abs/2010.11929) by Dosovitskiy et al. and first released in [this repository](https://github.com/google-research/vision_transformer). Fine-tuned on the German Traffic Sign Recognition Benchmark Dataset.
 ## Model description
-- Model Architecture: Vision Transformer (ViT) - google/vit-base-patch16-224.
+- Model Architecture: Vision Transformer (ViT) - google/vit-base-patch16-224-21k.
 - Fine-tuning Objective: Classify traffic signs into 43 different categories, including various speed limits, warning signs, and prohibitory or regulatory signs.
 - Developer: Aleksandra Cvetanovska
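
For readers unfamiliar with the naming, the `patch16-224` part of the checkpoint name can be unpacked with a little arithmetic. The following is a minimal sketch, not code from this repository: the helper `vit_sequence_length` and the constant `NUM_GTSRB_CLASSES` are hypothetical names introduced for illustration. It assumes the standard ViT layout, with non-overlapping square patches plus one classification token.

```python
# Illustrative sketch only: names below are hypothetical, not from the model repo.

NUM_GTSRB_CLASSES = 43  # number of traffic-sign categories stated in the model card


def vit_sequence_length(image_size: int = 224, patch_size: int = 16) -> int:
    """Return the number of tokens a standard ViT encoder processes.

    Assumes non-overlapping square patches and one extra [CLS] token.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size  # 224 // 16 = 14
    return patches_per_side ** 2 + 1             # 14 * 14 patches + [CLS]


if __name__ == "__main__":
    print(vit_sequence_length())   # 197
    print(NUM_GTSRB_CLASSES)       # size of the fine-tuned classification head
```

Under these assumptions, a 224x224 input is split into 196 patches, and the fine-tuned head maps the classification token to 43 outputs.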