Model Card for VAR (Visual AutoRegressive) Transformers 🔥

VAR is a visual generation framework that makes GPT-style autoregressive models surpass diffusion models in image generation for the first time 🚀, and exhibits clear power-law scaling laws 📈 like those of large language models (LLMs).

VAR redefines autoregressive learning on images as coarse-to-fine "next-scale prediction" (or "next-resolution prediction"), in contrast to the standard raster-scan "next-token prediction".
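To make the contrast concrete, here is a minimal counting sketch (not the model itself) comparing how many autoregressive steps each formulation needs for a 16×16 token map. The scale schedule `PATCH_NUMS` and the helper names are assumptions introduced for illustration; they are not stated in this model card.

```python
from typing import List, Sequence, Tuple

# Assumed coarse-to-fine schedule: side length of the token map at each scale.
# Illustrative only; this card does not specify the schedule.
PATCH_NUMS: Tuple[int, ...] = (1, 2, 3, 4, 5, 6, 8, 10, 13, 16)


def raster_scan_steps(side: int) -> int:
    """Next-token prediction: one autoregressive step per token."""
    return side * side


def next_scale_plan(patch_nums: Sequence[int]) -> List[Tuple[int, int, int]]:
    """Next-scale prediction: one autoregressive step per scale.

    Returns tuples of (step, tokens predicted in parallel at this step,
    tokens from all coarser scales available as context).
    """
    plan = []
    context = 0
    for step, side in enumerate(patch_nums, start=1):
        tokens = side * side
        plan.append((step, tokens, context))
        context += tokens
    return plan


if __name__ == "__main__":
    print("raster-scan steps:", raster_scan_steps(PATCH_NUMS[-1]))
    for step, tokens, context in next_scale_plan(PATCH_NUMS):
        print(f"scale step {step}: predict {tokens} tokens given {context}")
```

Under this assumed schedule, the raster-scan formulation takes one step per token of the final map, while the next-scale formulation takes one step per scale and predicts every token of that scale in parallel, conditioned on all coarser scales.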

This repo hosts VAR's checkpoints.

For more details and tutorials, see https://github.com/FoundationVision/VAR.
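As a rough sketch of how a checkpoint could be fetched from this repo, the snippet below builds a direct download URL using the Hugging Face Hub's `resolve/main` pattern and only the Python standard library. The filename `var_d16.pth` and the helper names are assumptions for illustration; check the repo's file list for the actual checkpoint names.

```python
import urllib.request
from pathlib import Path

REPO_ID = "FoundationVision/var"  # repo id of this model card


def checkpoint_url(filename: str, revision: str = "main") -> str:
    """Build the Hub download URL for a file in this repo."""
    return f"https://huggingface.co/{REPO_ID}/resolve/{revision}/{filename}"


def download_checkpoint(filename: str, target_dir: str = ".") -> Path:
    """Download one checkpoint file unless it already exists locally."""
    target = Path(target_dir) / filename
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(checkpoint_url(filename), str(target))
    return target


# Example (hypothetical filename; not confirmed by this card):
# path = download_checkpoint("var_d16.pth", target_dir="checkpoints")
```

Loading the downloaded weights into a model is covered by the tutorials in the GitHub repository linked above.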

Paper: Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction (arXiv:2404.02905, published Apr 3, 2024)