Model Details
This model uses the Vision Transformer (ViT) architecture to classify traffic signs from the German Traffic Sign Recognition Benchmark (GTSRB). It is fine-tuned for image classification, recognizing traffic signs across the benchmark's 43 classes.
Model Description
- Developed by: Kelvin Andreas
- Model type: Vision Transformer (ViT)
- Finetuned from model: google/vit-base-patch16-224-in21k
- Repository: https://huggingface.co/kelvinandreas/vit-traffic-sign-GTSRB
- Demo: https://huggingface.co/spaces/kelvinandreas/traffic-sign-classification
How to Get Started with the Model
To use the model, follow these steps:
- Install the required dependencies:
pip install transformers torch pillow
- Load the model and processor:
from transformers import ViTForImageClassification, ViTImageProcessor
import torch
from PIL import Image

# Load the fine-tuned model and its image processor
processor = ViTImageProcessor.from_pretrained("kelvinandreas/vit-traffic-sign-GTSRB")
model = ViTForImageClassification.from_pretrained("kelvinandreas/vit-traffic-sign-GTSRB")

# Load and preprocess the image
image = Image.open("path_to_image.jpg")
inputs = processor(images=image, return_tensors="pt")

# Make a prediction
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
predicted_class_idx = torch.argmax(logits, dim=-1).item()
print(predicted_class_idx)
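Alternatively, the same checkpoint can be loaded through the transformers pipeline API, which wraps the preprocessing and prediction steps shown above. The snippet below is a minimal sketch; the label strings it prints depend on the id2label mapping stored in the model's config, which is not documented in this card.

from transformers import pipeline
from PIL import Image

# Build an image-classification pipeline from the fine-tuned checkpoint
classifier = pipeline("image-classification", model="kelvinandreas/vit-traffic-sign-GTSRB")

# The pipeline handles resizing and normalization internally
image = Image.open("path_to_image.jpg")
predictions = classifier(image)

# Each entry contains a label (taken from the model config's id2label) and a confidence score
for prediction in predictions:
    print(prediction["label"], prediction["score"])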
Results
The model's performance on the GTSRB dataset is as follows:
- Accuracy: 0.9846
- Precision: 0.9853
- Recall: 0.9846
- F1 Score: 0.9846
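The card does not state how these metrics were computed. The sketch below shows one way accuracy, precision, recall, and F1 could be reproduced with scikit-learn, assuming the GTSRB test images and ground-truth class indices are available locally; the file paths, the sample list, and the weighted averaging are assumptions, not details from this card.

import torch
from PIL import Image
from transformers import ViTForImageClassification, ViTImageProcessor
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

processor = ViTImageProcessor.from_pretrained("kelvinandreas/vit-traffic-sign-GTSRB")
model = ViTForImageClassification.from_pretrained("kelvinandreas/vit-traffic-sign-GTSRB")
model.eval()

# test_samples: list of (image_path, true_class_index) pairs for the GTSRB test split.
# Hypothetical paths and labels for illustration only; the card does not specify the split.
test_samples = [("gtsrb_test/00000.png", 16), ("gtsrb_test/00001.png", 1)]

predictions, references = [], []
for image_path, true_label in test_samples:
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    predictions.append(int(torch.argmax(logits, dim=-1)))
    references.append(true_label)

accuracy = accuracy_score(references, predictions)
precision, recall, f1, _ = precision_recall_fscore_support(
    references, predictions, average="weighted", zero_division=0
)
print(f"Accuracy: {accuracy:.4f}  Precision: {precision:.4f}  Recall: {recall:.4f}  F1: {f1:.4f}")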