Model Details
This model uses the Vision Transformer (ViT) architecture to classify traffic signs from the German Traffic Sign Recognition Benchmark (GTSRB). It is fine-tuned for image classification, recognizing traffic signs across the benchmark's 43 classes.
Model Description
- Developed by: Kelvin Andreas
- Model type: Vision Transformer (ViT)
- Finetuned from model: google/vit-base-patch16-224-in21k
- Repository: https://huggingface.co/kelvinandreas/vit-traffic-sign-GTSRB
- Demo: https://huggingface.co/spaces/kelvinandreas/traffic-sign-classification
How to Get Started with the Model
To use the model, follow these steps:
- Install the required dependencies:
pip install transformers torch pillow
- Load the model and processor:
from transformers import ViTForImageClassification, ViTImageProcessor
import torch
from PIL import Image

# Load the fine-tuned model and its image processor
processor = ViTImageProcessor.from_pretrained("kelvinandreas/vit-traffic-sign-GTSRB")
model = ViTForImageClassification.from_pretrained("kelvinandreas/vit-traffic-sign-GTSRB")

# Load and preprocess the image
image = Image.open("path_to_image.jpg")
inputs = processor(images=image, return_tensors="pt")

# Make a prediction
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
predicted_class_idx = torch.argmax(logits, dim=-1).item()
print(predicted_class_idx)
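Alternatively, the same checkpoint can be loaded through the transformers pipeline API, which wraps the preprocessing and prediction steps shown above. The snippet below is a minimal sketch; the label strings it prints depend on the id2label mapping stored in the model's config, which is not documented in this card.

from transformers import pipeline
from PIL import Image

# Build an image-classification pipeline from the fine-tuned checkpoint
classifier = pipeline("image-classification", model="kelvinandreas/vit-traffic-sign-GTSRB")

# The pipeline handles resizing and normalization internally
image = Image.open("path_to_image.jpg")
predictions = classifier(image)

# Each entry contains a label (taken from the model config's id2label) and a confidence score
for prediction in predictions:
    print(prediction["label"], prediction["score"])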
Results
The model's performance on the GTSRB dataset is as follows:
- Accuracy: 0.9846
- Precision: 0.9853
- Recall: 0.9846
- F1 Score: 0.9846
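The card does not state how these metrics were computed. The sketch below shows one way accuracy, precision, recall, and F1 could be reproduced with scikit-learn, assuming the GTSRB test images and ground-truth class indices are available locally; the file paths, the sample list, and the weighted averaging are assumptions, not details from this card.

import torch
from PIL import Image
from transformers import ViTForImageClassification, ViTImageProcessor
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

processor = ViTImageProcessor.from_pretrained("kelvinandreas/vit-traffic-sign-GTSRB")
model = ViTForImageClassification.from_pretrained("kelvinandreas/vit-traffic-sign-GTSRB")
model.eval()

# test_samples: list of (image_path, true_class_index) pairs for the GTSRB test split.
# Hypothetical paths and labels for illustration only; the card does not specify the split.
test_samples = [("gtsrb_test/00000.png", 16), ("gtsrb_test/00001.png", 1)]

predictions, references = [], []
for image_path, true_label in test_samples:
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    predictions.append(int(torch.argmax(logits, dim=-1)))
    references.append(true_label)

accuracy = accuracy_score(references, predictions)
precision, recall, f1, _ = precision_recall_fscore_support(
    references, predictions, average="weighted", zero_division=0
)
print(f"Accuracy: {accuracy:.4f}  Precision: {precision:.4f}  Recall: {recall:.4f}  F1: {f1:.4f}")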