Datasets documentation

Image classification

You are viewing the v2.10.0 documentation. A newer version, v2.18.0, is available.

Image classification datasets are used to train a model to assign a single label to an entire image. These datasets enable a wide variety of applications, such as identifying endangered wildlife species or screening for disease in medical images. This guide shows you how to apply transformations to an image classification dataset.

Before you start, make sure you have up-to-date versions of albumentations and OpenCV (imported as cv2) installed:

pip install -U albumentations opencv-python

This guide uses the Beans dataset for identifying the type of bean plant disease based on an image of its leaf.

Load the dataset and take a look at an example:

>>> from datasets import load_dataset

>>> dataset = load_dataset("beans")
>>> dataset["train"][10]
{'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x500 at 0x7F8D2F4D7A10>,
 'image_file_path': '/root/.cache/huggingface/datasets/downloads/extracted/b0a21163f78769a2cf11f58dfc767fb458fc7cea5c05dccc0144a2c0f0bc1292/train/angular_leaf_spot/angular_leaf_spot_train.204.jpg',
 'labels': 0}

The dataset has three fields:

  • image: a PIL image object.
  • image_file_path: the path to the image file.
  • labels: the label or category of the image.

Next, check out an image. In a notebook, evaluating dataset["train"][10]["image"] displays the PIL image.

Now apply some augmentations with albumentations. You’ll randomly crop the image to 256x256 pixels, flip it horizontally with a probability of 0.5, and adjust its brightness and contrast with a probability of 0.2.

>>> import cv2
>>> import albumentations
>>> import numpy as np

>>> transform = albumentations.Compose([
...     albumentations.RandomCrop(width=256, height=256),
...     albumentations.HorizontalFlip(p=0.5),
...     albumentations.RandomBrightnessContrast(p=0.2),
... ])
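
To make the effect of the first two steps concrete, here is a minimal NumPy-only sketch of a random crop followed by a random horizontal flip. It illustrates the idea and is not how albumentations implements these operations. The helper names random_crop and random_hflip, the seeded generator, and the zero-filled stand-in image are assumptions introduced for this example.

```python
import numpy as np


def random_crop(image, height, width, rng):
    """Cut a random (height, width) window out of an H x W x C array."""
    max_top = image.shape[0] - height
    max_left = image.shape[1] - width
    if max_top < 0 or max_left < 0:
        raise ValueError("crop size is larger than the image")
    top = int(rng.integers(0, max_top + 1))
    left = int(rng.integers(0, max_left + 1))
    return image[top:top + height, left:left + width]


def random_hflip(image, p, rng):
    """Mirror the image left-to-right with probability p."""
    if rng.random() < p:
        return image[:, ::-1]
    return image


rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
image = np.zeros((500, 500, 3), dtype=np.uint8)  # stand-in for a 500x500 RGB image

cropped = random_crop(image, height=256, width=256, rng=rng)
augmented = random_hflip(cropped, p=0.5, rng=rng)
print(augmented.shape)  # (256, 256, 3)
```

Whatever window is chosen, the output always has the requested crop size, which is why every augmented example ends up with the same shape.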

Create a function to apply the transformation to the images:

>>> def transforms(examples):
...     examples["pixel_values"] = [
...         transform(image=np.array(image))["image"] for image in examples["image"]
...     ]
... 
...     return examples
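
The function above works on a batch: each key of examples maps to a list with one entry per row. The sketch below runs the same function body on a fake two-row batch so you can see the input and output layout without downloading anything. The center_crop helper is a hypothetical stand-in that only mimics the call convention of the albumentations pipeline (keyword argument image, dict result), and the zero-filled arrays stand in for real images.

```python
import numpy as np


def center_crop(image, size=256):
    """Hypothetical stand-in for the augmentation pipeline: same call
    convention, but a fixed center crop instead of random augmentations."""
    top = (image.shape[0] - size) // 2
    left = (image.shape[1] - size) // 2
    return {"image": image[top:top + size, left:left + size]}


def transforms(examples):
    # `examples` is a batch: each key maps to a list with one entry per row.
    examples["pixel_values"] = [
        center_crop(image=np.array(image))["image"] for image in examples["image"]
    ]
    return examples


# A fake two-row batch laid out like the image and labels columns.
batch = {
    "image": [np.zeros((500, 500, 3), dtype=np.uint8) for _ in range(2)],
    "labels": [0, 2],
}
out = transforms(batch)
print(sorted(out))                   # ['image', 'labels', 'pixel_values']
print(len(out["pixel_values"]))      # 2
print(out["pixel_values"][0].shape)  # (256, 256, 3)
```

The original columns are kept and a new pixel_values column is added alongside them, with one transformed array per input image.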

Use the set_transform() function to apply the transformation on the fly. The function runs on each batch only when it is accessed, so no transformed copy of the dataset is written to disk:

>>> dataset.set_transform(transforms)
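
The on-the-fly behavior can be illustrated in plain Python: the transform runs only when a row is read, so nothing is precomputed or stored. LazyDataset and double below are hypothetical names written for this sketch; this is a toy model of the idea, not how the datasets library is implemented.

```python
class LazyDataset:
    """Toy wrapper that applies a batch transform at access time."""

    def __init__(self, columns):
        self.columns = columns      # dict: column name -> list of values
        self.transform = None
        self.calls = 0              # counts how often the transform ran

    def set_transform(self, transform):
        self.transform = transform  # nothing is computed here

    def __getitem__(self, index):
        # Build a batch of one row, transform it, then unwrap the row.
        batch = {name: [values[index]] for name, values in self.columns.items()}
        if self.transform is not None:
            self.calls += 1
            batch = self.transform(batch)
        return {name: values[0] for name, values in batch.items()}


def double(batch):
    # Batch transform: adds a new column computed from an existing one.
    batch["doubled"] = [2 * x for x in batch["value"]]
    return batch


ds = LazyDataset({"value": [1, 2, 3]})
ds.set_transform(double)
print(ds.calls)  # 0, the transform has not run yet
print(ds[1])     # {'value': 2, 'doubled': 4}
print(ds.calls)  # 1
```

Because the work happens at access time, random augmentations produce a fresh result each time the same row is read.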

You can verify the transformation worked by indexing into the pixel_values of the first example:

>>> import matplotlib.pyplot as plt

>>> img = dataset["train"][0]["pixel_values"]
>>> plt.imshow(img)
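
Besides plotting, a quick programmatic check can confirm the crop size. The sketch below assumes pixel_values comes back as a 256x256 RGB array of type uint8; check_pixel_values is a hypothetical helper, and the zero-filled arrays stand in for dataset["train"][0]["pixel_values"].

```python
import numpy as np


def check_pixel_values(img, size=256):
    """Return True if img looks like a size x size RGB uint8 array."""
    img = np.asarray(img)
    return img.shape == (size, size, 3) and img.dtype == np.uint8


cropped = np.zeros((256, 256, 3), dtype=np.uint8)    # stand-in for a transformed image
uncropped = np.zeros((500, 500, 3), dtype=np.uint8)  # stand-in for an untransformed image

print(check_pixel_values(cropped))    # True
print(check_pixel_values(uncropped))  # False
```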

Now that you know how to process a dataset for image classification, learn how to train an image classification model and use it for inference.