---
license: apache-2.0
---
# ViT Fine-tuned on Stanford Car Dataset
Base model: https://huggingface.co/google/vit-base-patch16-224
This model achieves around 86% accuracy on the test set. You can use it as a baseline for further tuning.
# Dataset Description
The Stanford Cars dataset contains 16,185 images of 196 classes of cars. Classes are typically at the level of make, model, and year, e.g. 2012 Tesla Model S or 2012 BMW M3 coupe. In this case, the data is split into 8,144 training images, 6,041 test images, and 2,000 validation images.
**Please note: this dataset does not contain newer car models.**
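
The split sizes above add up to the full dataset (8,144 + 6,041 + 2,000 = 16,185). The card does not say how the validation images were chosen, so the following is only a sketch of one plausible approach: a seeded random hold-out of 2,000 images from the 8,041 images not used for training. The function name `split_indices` and the seed are hypothetical.

```python
import random


def split_indices(n_total, n_val, seed=0):
    """Shuffle range(n_total) reproducibly and hold out n_val indices for validation."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    # The first n_val shuffled indices become validation, the rest stay as test.
    return indices[n_val:], indices[:n_val]


# Assumption: the validation set is carved out of the 8,041 non-training images.
test_idx, val_idx = split_indices(n_total=8041, n_val=2000)
print(len(test_idx), len(val_idx))  # 6041 2000
```

Fixing the seed keeps the split stable across runs, so reported test accuracy stays comparable.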
# Using the Model in the Transformers Library
```
from transformers import AutoFeatureExtractor, AutoModelForImageClassification

# The extractor resizes and normalizes images to match the ViT input format
extractor = AutoFeatureExtractor.from_pretrained("therealcyberlord/stanford-car-vit-patch16")
# The model outputs one logit per car class
model = AutoModelForImageClassification.from_pretrained("therealcyberlord/stanford-car-vit-patch16")
```
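
Once the extractor and model are loaded, classifying an image could look roughly like the sketch below. It assumes `torch` and `Pillow` are installed and that the checkpoint's config carries an `id2label` mapping. The helper names `top_k_predictions` and `classify_image` and the file name `car.jpg` are hypothetical, not part of this repository.

```python
import math


def top_k_predictions(logits, id2label, k=5):
    """Turn raw logits (a list of floats) into the k most probable (label, probability) pairs."""
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(id2label[i], probs[i]) for i in ranked]


def classify_image(image_path, k=5):
    """Run the fine-tuned ViT on one image file and return its top-k predictions."""
    # Heavy dependencies are imported lazily so the helper above works without them.
    import torch
    from PIL import Image
    from transformers import AutoFeatureExtractor, AutoModelForImageClassification

    name = "therealcyberlord/stanford-car-vit-patch16"
    extractor = AutoFeatureExtractor.from_pretrained(name)
    model = AutoModelForImageClassification.from_pretrained(name)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = extractor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    # Assumption: model.config.id2label maps class indices to car names.
    return top_k_predictions(logits, model.config.id2label, k)
```

Calling `classify_image("car.jpg", k=3)` would download the checkpoint on first use and return three `(label, probability)` pairs, most likely first.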
<img src="https://ai.stanford.edu/~jkrause/cars/class_montage.jpg">
# Citations
3D Object Representations for Fine-Grained Categorization
Jonathan Krause, Michael Stark, Jia Deng, Li Fei-Fei
4th IEEE Workshop on 3D Representation and Recognition, at ICCV 2013 (3dRR-13). Sydney, Australia. Dec. 8, 2013.