CIFAR-10 Upside Down Classifier

For the Fatima Fellowship 2022 Coding Challenge, DL for Vision track.

W&B Report

Model Definition

from torch import nn
import timm
from huggingface_hub import PyTorchModelHubMixin

class UpDownEfficientNetB0(nn.Module, PyTorchModelHubMixin):
    """A simple Hub Mixin wrapper for timm EfficientNet-B0. Used to classify whether a CIFAR-10 image is upright or flipped upside down."""

    def __init__(self, **kwargs):
        super().__init__()  # Must run before any submodule is assigned on an nn.Module
        self.base_model = timm.create_model('efficientnet_b0', num_classes=1, drop_rate=0.2, drop_path_rate=0.2)
        self.config = kwargs.pop("config", None)

    def forward(self, input):
        return self.base_model(input)
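
The docstring above describes the task as telling upright images from ones flipped upside down. As a rough illustration of what that flip means for an image stored as a list of pixel rows, here is a minimal sketch. The helper names and the label convention (0 for upright, 1 for flipped) are assumptions made for this example and are not taken from the training code.

```python
from typing import List, Sequence, Tuple


def flip_upside_down(image_rows: Sequence) -> List:
    """Return a copy of the image with its rows in reverse order (vertical flip)."""
    return list(reversed(image_rows))


def make_labeled_pair(image_rows: Sequence) -> List[Tuple[List, int]]:
    """Build one upright and one flipped sample from a single image.

    Hypothetical label convention: 0 = upright, 1 = flipped upside down.
    """
    return [(list(image_rows), 0), (flip_upside_down(image_rows), 1)]


# A tiny 3x2 "image": each inner list is one row of pixel values.
tiny_image = [[1, 2], [3, 4], [5, 6]]
samples = make_labeled_pair(tiny_image)
```

For a CHW tensor, the same operation would reverse the height axis rather than the outer list.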

Loading the Model from Hub

net = UpDownEfficientNetB0.from_pretrained("ID56/FF-Vision-CIFAR")

Running Inference

from torchvision import transforms

CIFAR_MEAN = (0.4914, 0.4822, 0.4465)
CIFAR_STD = (0.247, 0.243, 0.261)

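transform = transforms.Compose([
    transforms.ToTensor(),        # PIL Image or uint8 HWC array -> float CHW tensor in [0, 1]
    transforms.Resize((40, 40)),  # Size is passed as a single (height, width) tuple
    transforms.Normalize(CIFAR_MEAN, CIFAR_STD)
])
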
image = load_some_image()  # Load some PIL Image or uint8 HWC image array
image = transform(image)   # Convert to CHW image tensor
image = image.unsqueeze(0) # Add batch dimension


pred = net(image)  # Raw output of shape (1, 1): one logit per image
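
Because the model is created with `num_classes=1`, `net(image)` returns one raw logit per image rather than a probability. The sketch below shows one way to turn that logit into a probability and a yes/no decision, assuming the model was trained with a binary cross-entropy style loss. The helper names and the 0.5 threshold are hypothetical, and which of the two classes (upright or flipped) counts as positive is not stated here, so check it against the training setup.

```python
import math


def logit_to_probability(logit: float) -> float:
    """Numerically stable sigmoid for a single raw logit."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    exp_logit = math.exp(logit)  # Safe: logit is negative, so this cannot overflow
    return exp_logit / (1.0 + exp_logit)


def is_positive(logit: float, threshold: float = 0.5) -> bool:
    """True when the sigmoid probability reaches the (assumed) threshold."""
    return logit_to_probability(logit) >= threshold
```

With the inference snippet above, the logit for a single image could be read as `pred.item()` and passed to `logit_to_probability`.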