
This is the model release for the paper

*Elucidating the Design Space of Language Models for Image Generation*

See the paper on arXiv and the code on GitHub.

We provide four Binary-Autoencoder (BAE) tokenizers, following Binary Latent Diffusion, with code dimensions 16, 20, and 24 (the 16-dimensional tokenizer comes in two variants, with and without Bernoulli sampling). Each was trained for 1,000,000 iterations with batch size 256.

| Code Dim | Bernoulli Sampling | Link | Size |
|----------|--------------------|------|-------|
| 16       | ✅                 | link | 332MB |
| 16       | ❌                 | link | 332MB |
| 20       | ✅                 | link | 332MB |
| 24       | ✅                 | link | 332MB |
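
The Bernoulli-sampling column indicates whether the tokenizer binarizes its latents stochastically during training. As a rough illustration, a binary bottleneck in the style of Binary Latent Diffusion can be sketched as below; this is a minimal sketch with hypothetical module and argument names, not the released code:

```python
import torch
import torch.nn as nn

class BinaryQuantizer(nn.Module):
    """Sketch of a BAE binary bottleneck (hypothetical, for illustration)."""

    def __init__(self, bernoulli_sampling: bool = True):
        super().__init__()
        self.bernoulli_sampling = bernoulli_sampling

    def forward(self, logits: torch.Tensor) -> torch.Tensor:
        # logits: (B, H, W, code_dim) pre-activation encoder output
        probs = torch.sigmoid(logits)
        if self.bernoulli_sampling and self.training:
            # Stochastic binarization: draw each bit ~ Bernoulli(p)
            codes = torch.bernoulli(probs)
        else:
            # Deterministic binarization at threshold 0.5
            codes = (probs > 0.5).float()
        # Straight-through estimator: hard codes on the forward pass,
        # sigmoid gradient on the backward pass
        return codes + probs - probs.detach()
```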

The generation-model architecture is adapted from Llama 2, following LlamaGen.

| Model   | Links                         | Size          |
|---------|-------------------------------|---------------|
| AR-L    | [1-16] [2-8] [2-10] [2-12]    | 1.25GB~1.77GB |
| AR-XL   | [1-16] [2-8] [2-10] [2-12]    | 2.95GB~3.6GB  |
| AR-XXL  | [1-16] [2-10] [2-12]          | 5.49GB~6.25GB |
| AR-2B   | [2-12]                        | 7.64GB        |
| MLM-L   | [1-16]                        | 1.51GB        |
| MLM-XL  | [1-16]                        | 3.27GB        |
| MLM-XXL | [1-16]                        | 5.86GB        |
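
The bracket labels appear to denote a token-factorization scheme: [n-k] plausibly means each image token's binary code is split into n sub-tokens of k bits each, so [2-8] would pair with the 16-dim tokenizer, [2-10] with the 20-dim one, and [2-12] with the 24-dim one. A minimal sketch of such a split, with a hypothetical helper name (not the released code):

```python
import torch

def factorize_codes(codes: torch.Tensor, n_subtokens: int) -> torch.Tensor:
    """Split each d-bit binary code into n sub-tokens of d/n bits,
    packing each group into an integer index in [0, 2**(d/n))."""
    B, L, d = codes.shape                 # codes: (batch, seq_len, code_dim) in {0, 1}
    k = d // n_subtokens                  # bits per sub-token
    bits = codes.view(B, L, n_subtokens, k).long()
    weights = 2 ** torch.arange(k)        # binary place values (LSB-first convention)
    return (bits * weights).sum(-1).view(B, L * n_subtokens)

# e.g. a 16-dim code under the "[2-8]" scheme yields two tokens in [0, 255]
codes = torch.randint(0, 2, (1, 256, 16)).float()
tokens = factorize_codes(codes, n_subtokens=2)    # shape (1, 512)
```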
