SDXL-Lightning ONNX (WebGPU, 4-bit / small)
Lightweight ONNX build of ByteDance/SDXL-Lightning
(4-step) for in-browser inference via onnxruntime-web (WebGPU). The UNet is
4-bit weight-only quantized (MatMulNBits) into a single model_q4.onnx (~1.86 GB).
Text encoders, VAE and tokenizers are shared from the fp16 repo
(d0gr/sdxl-lightning-onnx-web-fp16).
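The split between the two repos can be sketched as a small lookup. This is only an illustration: `hfResolveUrl` and `componentRepo` are hypothetical names, the URL follows the usual Hugging Face `resolve` pattern, and placing `model_q4.onnx` at the repo root is an assumption. File names inside the fp16 repo are not stated here, so none are filled in.

```typescript
// Hypothetical helper: builds a Hugging Face "resolve" download URL.
function hfResolveUrl(repo: string, file: string, revision: string = "main"): string {
  return `https://huggingface.co/${repo}/resolve/${revision}/${file}`;
}

const UNET_REPO = "d0gr/sdxl-lightning-onnx-webgpu-int4"; // this repo (int4 UNet)
const SHARED_REPO = "d0gr/sdxl-lightning-onnx-web-fp16";  // shared fp16 components

// Which repo each pipeline component is fetched from.
const componentRepo: Record<string, string> = {
  unet: UNET_REPO,
  textEncoders: SHARED_REPO,
  vae: SHARED_REPO,
  tokenizers: SHARED_REPO,
};

// Assumption: model_q4.onnx sits at the root of the int4 repo.
const unetUrl = hfResolveUrl(componentRepo.unet, "model_q4.onnx");
console.log(unetUrl);
```

Only the UNet download changes between the light and fp16 variants; the other components resolve to the same repo in both cases.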
Used by the Generate AI Images extension (local SDXL generation in the browser, no server). This is the "light" variant for weaker GPUs: a smaller download and lower RAM use, at the cost of slightly softer detail.
- Resolution: 1024×1024
- Steps: 4, guidance 0 (single pass)
- UNet: int4 (MatMulNBits), fp32 I/O; needs ORT 1.25+
- Size: ~1.86 GB (UNet) + shared encoders/VAE
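As a rough sketch of what these settings imply for the sampling loop, the snippet below derives a 4-step timestep schedule and the latent tensor shape for a 1024×1024 image. Several values are assumptions and not stated in this card: "trailing" timestep spacing over 1000 training timesteps (the setting commonly used with SDXL-Lightning), a VAE downscale factor of 8, and 4 latent channels (standard for SDXL). The function names are hypothetical.

```typescript
// Assumed "trailing" spacing: evenly strided timesteps ending at the last training step.
function trailingTimesteps(numInferenceSteps: number, numTrainTimesteps: number = 1000): number[] {
  const stride = numTrainTimesteps / numInferenceSteps;
  const steps: number[] = [];
  for (let i = 0; i < numInferenceSteps; i++) {
    steps.push(Math.round(numTrainTimesteps - i * stride) - 1);
  }
  return steps;
}

// Latent shape [batch, channels, height/8, width/8], assuming the standard SDXL VAE.
function latentShape(width: number, height: number, batch: number = 1): number[] {
  const VAE_SCALE = 8;       // assumption: SDXL VAE downscale factor
  const LATENT_CHANNELS = 4; // assumption: SDXL latent channels
  return [batch, LATENT_CHANNELS, height / VAE_SCALE, width / VAE_SCALE];
}

const timesteps = trailingTimesteps(4);  // [999, 749, 499, 249]
const shape = latentShape(1024, 1024);   // [1, 4, 128, 128]

// fp32 I/O: the UNet input buffer is a Float32Array even though the weights are int4.
const latents = new Float32Array(shape.reduce((a, b) => a * b, 1));
console.log(timesteps, shape, latents.length);
```

With guidance 0 there is no classifier-free guidance branch, so the batch stays at 1 and each of the 4 steps is a single UNet call.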
License
Derivative work combining three upstream sources (all permissive; commercial use permitted, subject to the RAIL++-M use restrictions):
- UNet: ByteDance/SDXL-Lightning (CreativeML Open RAIL++-M)
- SDXL base (text encoders): stabilityai/stable-diffusion-xl-base-1.0 (CreativeML Open RAIL++-M)
- VAE: madebyollin/sdxl-vae-fp16-fix (MIT)
Model tree for d0gr/sdxl-lightning-onnx-webgpu-int4
Base model
ByteDance/SDXL-Lightning