## Model description
This is a fine-tuned model based on [apple/mobilevitv2-1.0-imagenet1k-256](https://huggingface.co/apple/mobilevitv2-1.0-imagenet1k-256), trained for sketch image recognition on the [Xenova/quickdraw-small](https://huggingface.co/datasets/Xenova/quickdraw-small) dataset.

## How to use
```python
from transformers import MobileViTImageProcessor, MobileViTV2ForImageClassification
from PIL import Image
import requests

url = "https://static.thenounproject.com/png/2024184-200.png"
response = requests.get(url, stream=True)

# Convert to grayscale to match the single-channel sketch input
image = Image.open(response.raw).convert("L")

processor = MobileViTImageProcessor.from_pretrained("laszlokiss27/doodle-dash2")
model = MobileViTV2ForImageClassification.from_pretrained("laszlokiss27/doodle-dash2")

# The processor handles resizing, rescaling, and conversion to tensors,
# so the PIL image can be passed in directly
inputs = processor(images=image, return_tensors="pt")

outputs = model(**inputs)
logits = outputs.logits

# Get the top prediction
predicted_class_idx = logits.argmax(-1).item()
predicted_class = model.config.id2label[predicted_class_idx]
print("Predicted class:", predicted_class)
```
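For a sketch-recognition game it is often more useful to show several ranked guesses than a single label. A minimal sketch of how to rank the top-k classes, using dummy logits in place of `outputs.logits` from the example above (the class count and values here are illustrative, not from the real model):

```python
import torch

# Dummy logits standing in for outputs.logits (shape [batch, num_classes]).
logits = torch.tensor([[0.1, 2.5, 0.3, 1.7]])

# Softmax turns raw logits into probabilities; topk ranks the best guesses.
probs = torch.softmax(logits, dim=-1)
top_probs, top_idxs = probs.topk(k=3, dim=-1)

for p, i in zip(top_probs[0].tolist(), top_idxs[0].tolist()):
    print(f"class {i}: {p:.3f}")
```

With the real model, map each index through `model.config.id2label` as in the snippet above to get human-readable class names.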