SDXL-Lightning ONNX (WebGPU, 4-bit / small)

Lightweight ONNX build of ByteDance/SDXL-Lightning (4-step) for in-browser inference via onnxruntime-web (WebGPU). The UNet is 4-bit weight-only quantized (MatMulNBits) into a single model_q4.onnx (~1.86 GB). Text encoders, VAE and tokenizers are shared from the fp16 repo (d0gr/sdxl-lightning-onnx-web-fp16).
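As a rough illustration of how the single-file UNet might be fetched and opened with the WebGPU execution provider, here is a minimal sketch. The repo id, the `main` revision, and the assumption that `model_q4.onnx` sits at the repo root are not confirmed by this card; `hfResolveUrl` and `unetSessionOptions` are hypothetical helper names. The onnxruntime-web call is left in a comment so the snippet stays self-contained.

```typescript
// Assumption: repo id and file location. Only the file name model_q4.onnx is stated in this card.
const REPO_ID = "d0gr/sdxl-lightning-onnx-webgpu-int4";
const UNET_FILE = "model_q4.onnx";

// Hypothetical helper: builds a Hugging Face "resolve" download URL for a file in a repo.
function hfResolveUrl(repoId: string, file: string, revision: string = "main"): string {
  return `https://huggingface.co/${repoId}/resolve/${revision}/${file}`;
}

// Hypothetical helper: minimal session options that request the WebGPU execution provider.
function unetSessionOptions(): { executionProviders: string[] } {
  return { executionProviders: ["webgpu"] };
}

const unetUrl = hfResolveUrl(REPO_ID, UNET_FILE);

// In the browser, with onnxruntime-web installed (not imported here):
//   import * as ort from "onnxruntime-web"; // some releases expose WebGPU via "onnxruntime-web/webgpu"
//   const session = await ort.InferenceSession.create(unetUrl, unetSessionOptions());
```

The text encoders, VAE and tokenizers would be loaded the same way from the fp16 repo, using whatever file paths that repo defines.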

Used by the Generate AI Images extension (local SDXL generation in the browser, no server). This is the "light" variant for weaker GPUs: a smaller download and lower RAM use, at the cost of slightly softer detail.

  • Resolution: 1024×1024
  • Steps: 4, guidance 0 (single pass)
  • UNet: int4 (MatMulNBits), fp32 I/O; needs ORT 1.25+
  • Size: ~1.86 GB (UNet) + shared encoders/VAE
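The settings above can be sketched as a small config plus the tensor sizes they imply. This is an illustration, not the extension's actual code: the VAE downscale factor of 8 and the 4 latent channels are the usual SDXL values rather than something this card states, the rule that guidance above 1 needs a second unconditional pass follows the common classifier-free-guidance convention, and all names (`GenConfig`, `latentShape`, `unetPasses`, `allocateLatents`) are hypothetical.

```typescript
interface GenConfig {
  width: number;
  height: number;
  steps: number;
  guidanceScale: number;
}

// Values taken from the list above.
const LIGHTNING_CONFIG: GenConfig = { width: 1024, height: 1024, steps: 4, guidanceScale: 0 };

// Assumption: standard SDXL latent layout (not stated in this card).
const VAE_SCALE = 8;
const LATENT_CHANNELS = 4;

// Shape of the UNet latent input as [batch, channels, height, width].
function latentShape(cfg: GenConfig): [number, number, number, number] {
  if (cfg.width % VAE_SCALE !== 0 || cfg.height % VAE_SCALE !== 0) {
    throw new Error("width and height must be multiples of " + VAE_SCALE);
  }
  return [1, LATENT_CHANNELS, cfg.height / VAE_SCALE, cfg.width / VAE_SCALE];
}

// Number of UNet evaluations: one per step, doubled only when
// classifier-free guidance is active (assumed here to mean guidanceScale > 1).
function unetPasses(cfg: GenConfig): number {
  return cfg.steps * (cfg.guidanceScale > 1 ? 2 : 1);
}

// fp32 I/O means the latent buffer is a Float32Array even though the weights are int4.
function allocateLatents(cfg: GenConfig): Float32Array {
  const [n, c, h, w] = latentShape(cfg);
  return new Float32Array(n * c * h * w);
}
```

With guidance 0 the unconditional branch is skipped, so a full image costs four UNet evaluations on a 1×4×128×128 fp32 latent.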

License

Derivative work combining three upstream sources (all permissive; commercial use permitted, subject to the RAIL++-M use restrictions):
