AIM: Autoregressive Image Models

Alaaeldin El-Nouby, Michal Klein, Shuangfei Zhai, Miguel Angel Bautista, Alexander Toshev, Vaishaal Shankar, Joshua M Susskind, and Armand Joulin

This software project accompanies the research paper, Scalable Pre-training of Large Autoregressive Image Models.

We introduce AIM, a collection of vision models pre-trained with an autoregressive generative objective. We show that autoregressive pre-training of image features exhibits scaling properties similar to those of its textual counterpart (i.e., Large Language Models). Specifically, we highlight two findings:

  1. the model capacity can be trivially scaled to billions of parameters, and
  2. AIM effectively leverages large collections of uncurated image data.

Installation

Please install PyTorch using the official installation instructions. Afterward, install the package with:

pip install git+https://git@github.com/apple/ml-aim.git

Usage

The example below loads a pre-trained model from the HuggingFace Hub and runs it on a single image:

from PIL import Image

from aim.torch.models import AIMForImageClassification
from aim.torch.data import val_transforms

# Load an image and the pre-trained model from the HuggingFace Hub.
img = Image.open(...)
model = AIMForImageClassification.from_pretrained("apple/aim-600M")
transform = val_transforms()

# Preprocess the image and add a batch dimension before the forward pass.
inp = transform(img).unsqueeze(0)
logits, features = model(inp)
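
To read class predictions off the returned logits, one option is a softmax followed by a top-k selection. The sketch below is illustrative only: it assumes `logits` has shape (1, num_classes), so that `logits[0].tolist()` gives a plain list of floats, and the helper name `top_k_probs` is hypothetical rather than part of the `aim` package. It uses only the Python standard library so it can be run without the model.

```python
import math


def top_k_probs(logits_row, k=5):
    """Return the k most likely (class_index, probability) pairs."""
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(logits_row)
    exps = [math.exp(x - m) for x in logits_row]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort class indices by descending probability and keep the first k.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Toy 4-class input; with the model above one would pass logits[0].tolist()
# (assumption: logits has shape (1, num_classes)).
example = top_k_probs([2.0, 0.5, -1.0, 3.0], k=2)
```

With PyTorch tensors, the same result can be obtained directly with `torch.softmax` and `torch.topk`; the list-based version is shown here only to keep the example dependency-free.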

ImageNet-1k results (frozen trunk)

The table below reports top-1 classification accuracy on the ImageNet-1k validation set.

model       top-1 IN-1k
            last layer    best layer
AIM-0.6B    78.5%         79.4%
AIM-1B      80.6%         82.3%
AIM-3B      82.2%         83.3%
AIM-7B      82.4%         84.0%
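
As a quick reading aid, the gap between the two columns can be computed directly from the reported numbers. The snippet below is a small sketch that only restates the table values; the variable names are illustrative.

```python
# Top-1 accuracy (%) on ImageNet-1k, copied from the table: (last layer, best layer).
results = {
    "AIM-0.6B": (78.5, 79.4),
    "AIM-1B": (80.6, 82.3),
    "AIM-3B": (82.2, 83.3),
    "AIM-7B": (82.4, 84.0),
}

# Gain in percentage points from using the best layer instead of the last layer.
gains = {name: round(best - last, 1) for name, (last, best) in results.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain} points")
```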