SDXL-Lightning ONNX (WebGPU, 4-bit / small)

Lightweight ONNX build of ByteDance/SDXL-Lightning (4-step) for in-browser inference via onnxruntime-web (WebGPU). The UNet is 4-bit weight-only quantized (MatMulNBits) and stored as a single file, model_q4.onnx (~1.86 GB). The text encoders, VAE, and tokenizers are shared with the fp16 repo (d0gr/sdxl-lightning-onnx-web-fp16) and loaded from there.
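As a rough sketch of how a page might locate the UNet file, the helper below builds a Hugging Face download URL using the Hub's standard `/resolve/<revision>/<path>` pattern. The helper name `hfResolveUrl` is hypothetical, and placing model_q4.onnx at the repo root is an assumption; adjust the path if the file lives in a subfolder.

```typescript
// Hypothetical helper: builds a Hugging Face Hub download URL for a file in a repo.
function hfResolveUrl(repo: string, path: string, revision: string = "main"): string {
  // Encode each path segment separately so "/" separators are preserved.
  const safePath = path.split("/").map(encodeURIComponent).join("/");
  return `https://huggingface.co/${repo}/resolve/${revision}/${safePath}`;
}

// Assumption: model_q4.onnx sits at the root of this repo.
const UNET_URL = hfResolveUrl("d0gr/sdxl-lightning-onnx-webgpu-int4", "model_q4.onnx");

// In a browser, this URL could then be handed to onnxruntime-web, for example:
//   const session = await ort.InferenceSession.create(UNET_URL, {
//     executionProviders: ["webgpu"],
//   });
```

The text encoders, VAE, and tokenizers would be fetched the same way from d0gr/sdxl-lightning-onnx-web-fp16; their file paths are not listed here, so they are left out of the sketch.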

Used by the Generate AI Images extension, which runs SDXL generation locally in the browser with no server. This is the "light" variant for weaker GPUs: a smaller download and lower memory use, at the cost of slightly softer detail.

Resolution: 1024×1024
Steps: 4, guidance scale 0 (no classifier-free guidance, so a single UNet pass per step)
UNet: int4 weights (MatMulNBits), fp32 inputs/outputs; requires ORT 1.25+
Size: ~1.86 GB (UNet) + shared encoders/VAE
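The settings above translate into concrete tensor sizes. The sketch below is illustrative rather than the extension's actual code: it derives the latent shape the UNet would see and the number of UNet passes per image. The 8x VAE downscale and 4 latent channels are the standard SDXL values; all names (`GenerationSettings`, `latentShape`, `unetPasses`, `allocateLatents`) are hypothetical, and the rule that guidance only doubles the passes when the scale is above 1 is an assumed convention.

```typescript
// Generation settings as listed in this card.
interface GenerationSettings {
  width: number;
  height: number;
  steps: number;
  guidanceScale: number;
}

const LIGHTNING_SETTINGS: GenerationSettings = {
  width: 1024,
  height: 1024,
  steps: 4,
  guidanceScale: 0,
};

// Standard SDXL constants: the VAE downsamples each side by 8, latents have 4 channels.
const VAE_SCALE_FACTOR = 8;
const LATENT_CHANNELS = 4;

// Shape of the latent tensor in NCHW order.
function latentShape(s: GenerationSettings, batch: number = 1): [number, number, number, number] {
  return [batch, LATENT_CHANNELS, s.height / VAE_SCALE_FACTOR, s.width / VAE_SCALE_FACTOR];
}

// Assumed convention: classifier-free guidance (two passes per step) only when scale > 1.
function unetPasses(s: GenerationSettings): number {
  const passesPerStep = s.guidanceScale > 1 ? 2 : 1;
  return s.steps * passesPerStep;
}

// The UNet takes fp32 I/O, so latents are held in a Float32Array.
function allocateLatents(s: GenerationSettings): Float32Array {
  const elementCount = latentShape(s).reduce((a, b) => a * b, 1);
  return new Float32Array(elementCount);
}
```

With these settings, a 1024×1024 image corresponds to a 1×4×128×128 latent (65,536 fp32 values, 256 KiB), and the UNet runs four times per image.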

License

Derivative work combining three upstream sources (all permissive; commercial use permitted, subject to the RAIL++-M use restrictions):

UNet — ByteDance/SDXL-Lightning — CreativeML Open RAIL++-M
SDXL base (text encoders) — stabilityai/stable-diffusion-xl-base-1.0 — CreativeML Open RAIL++-M
VAE — madebyollin/sdxl-vae-fp16-fix — MIT
