Image Classification

Image classification is the task of assigning a label or class to an entire image. Each image is expected to belong to exactly one class. Image classification models take an image as input and return a prediction of which class the image belongs to.

[Figure: an image classification model returning candidate labels such as Egyptian cat, Tabby cat, and Tiger cat.]

About Image Classification

Use Cases

Image classification models are a good fit when you only need to know what an image contains, and are not interested in specific instances of objects, their location, or their shape.

Keyword Classification

Image classification models are used widely in stock photography to assign each image a keyword.

Image Search

Models trained for image classification can improve user experience by organizing and categorizing photo galleries, on the phone or in the cloud, by multiple keywords or tags.
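
As a rough illustration of tag-based gallery search, the sketch below builds an inverted index from predicted tags to image names. It assumes predictions arrive as lists of label/score dictionaries, the shape returned by the transformers pipeline. The file names, scores, the `build_tag_index` helper, and the `min_score` threshold are all hypothetical and only serve the example.

```python
from collections import defaultdict


def build_tag_index(predictions_by_image, min_score=0.2):
    """Map each tag to the images whose predictions include it with score >= min_score."""
    index = defaultdict(list)
    for image_name, predictions in predictions_by_image.items():
        for pred in predictions:
            if pred["score"] >= min_score:
                index[pred["label"]].append(image_name)
    return dict(index)


# Hypothetical classifier outputs for a small gallery (not real model results).
gallery = {
    "IMG_001.jpg": [{"label": "tabby cat", "score": 0.73}, {"label": "tiger cat", "score": 0.21}],
    "IMG_002.jpg": [{"label": "beach", "score": 0.88}],
    "IMG_003.jpg": [{"label": "tiger cat", "score": 0.64}, {"label": "tabby cat", "score": 0.05}],
}

index = build_tag_index(gallery)

# Searching the gallery for a tag is then a dictionary lookup.
print(index.get("tiger cat", []))
```

Keeping several tags per image, rather than only the top one, is what lets a single photo show up under more than one search term. The threshold is a tunable trade-off between missing relevant photos and returning noisy matches.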


With the transformers library, you can use the image-classification pipeline to run inference with image classification models. You can initialize the pipeline with a model id from the Hub. If you do not provide a model id, it will initialize with google/vit-base-patch16-224 by default. When calling the pipeline, you just need to pass a file path, an HTTP link, or an image loaded with PIL. You can also provide a top_k parameter, which determines how many results it returns.

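from transformers import pipeline

# No model id is given, so the pipeline loads its default checkpoint.
clf = pipeline("image-classification")

# "path_to_a_cat_image.png" is a placeholder: pass a local path, an HTTP link, or a PIL image.
clf("path_to_a_cat_image.png")
# [{'label': 'tabby cat', 'score': 0.731},
#  ...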
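
The model id and top_k options described above can be combined as in the following minimal sketch. The `classify` and `best_label` helpers, the `min_score` threshold, and the image path in the usage note are assumptions made for illustration; only `pipeline("image-classification", model=...)` and the `top_k` call argument come from transformers itself.

```python
from typing import Any, Optional


def classify(image: Any, model_id: str = "google/vit-base-patch16-224", top_k: int = 3):
    """Run the image-classification pipeline on a file path, HTTP link, or PIL image."""
    # Imported lazily so that best_label() below works without transformers installed.
    from transformers import pipeline

    clf = pipeline("image-classification", model=model_id)
    # top_k caps how many label/score dictionaries are returned.
    return clf(image, top_k=top_k)


def best_label(predictions, min_score: float = 0.5) -> Optional[str]:
    """Return the highest-scoring label, or None if its score is below min_score."""
    top = max(predictions, key=lambda p: p["score"])
    return top["label"] if top["score"] >= min_score else None
```

Calling `classify("path_to_a_cat_image.png", top_k=2)` would return at most two label/score dictionaries, which can be passed straight to `best_label`. Thresholding on the score is one simple way to avoid acting on low-confidence predictions. A suitable cut-off depends on the model and the data.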

Useful Resources

Creating your own image classifier in just a few minutes

With HuggingPics, you can fine-tune Vision Transformers for anything using images found on the web. This project downloads images of classes defined by you, trains a model, and pushes it to the Hub. You even get to try out the model directly with a working widget in the browser, ready to be shared with all your friends!

Compatible libraries

Keras, Timm, Transformers
Models for Image Classification

Note: Strong image classification model trained on the ImageNet dataset.

Datasets for Image Classification

Note: Benchmark dataset used for image classification with images that belong to 100 classes.

Metrics for Image Classification
Accuracy is the proportion of correct predictions among the total number of cases processed. It can be computed with:
    Accuracy = (TP + TN) / (TP + TN + FP + FN)
where TP is the number of true positives, TN the true negatives, FP the false positives, and FN the false negatives.

Recall is the fraction of the positive examples that were correctly labeled by the model as positive. It can be computed with:
    Recall = TP / (TP + FN)
where TP is the number of true positives and FN the number of false negatives.

Precision is the fraction of correctly labeled positive examples out of all the examples that were labeled as positive. It can be computed with:
    Precision = TP / (TP + FP)
where TP is the number of true positives (the examples correctly labeled as positive) and FP the number of false positives (the examples incorrectly labeled as positive).

The F1 score is the harmonic mean of the precision and recall. It can be computed with:
    F1 = 2 * (precision * recall) / (precision + recall)
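
The four formulas above can be sketched in plain Python as follows. The example treats one class ("cat") as the positive class and everything else as negative, which is one common way to apply TP/TN/FP/FN counts to a multi-class problem. The labels, the helper names, and the choice to return 0.0 when a denominator is zero are assumptions made for this illustration.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, TN, FP, FN, treating `positive` as the positive class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn


def safe_div(numerator, denominator):
    """Divide, returning 0.0 for an empty denominator (a convention, not a rule)."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from the four counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


# Hypothetical ground-truth labels and predictions for ten images.
y_true = ["cat", "cat", "cat", "cat", "dog", "dog", "dog", "dog", "dog", "dog"]
y_pred = ["cat", "cat", "cat", "dog", "cat", "dog", "dog", "dog", "dog", "dog"]

counts = confusion_counts(y_true, y_pred, positive="cat")
metrics = classification_metrics(*counts)
print(counts)
print(metrics)
```

With these made-up labels the counts are TP = 3, TN = 5, FP = 1, FN = 1, so accuracy is 8/10 = 0.8 while precision, recall, and F1 are all 0.75. In practice a library implementation of these metrics is usually a better choice than hand-rolled code.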