Quantized 8-bit model?

#167
by Roronoa099 - opened

Is there any 8-bit or 4-bit quantized model for this?

In Brevitas (https://github.com/Xilinx/brevitas/tree/dev/src/brevitas_examples/stable_diffusion) we have a pipeline to quantize SDXL, with several command-line options to change the quantization configuration.

We support ONNX export to QCDQ format.
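For readers unfamiliar with the QCDQ export style: it represents each quantized tensor as a Quantize → Clip → DeQuantize chain of ONNX nodes, so downstream tools see ordinary float tensors whose values are constrained to an integer grid. A minimal sketch of that numeric pattern (illustrative only, not Brevitas's actual implementation):

```python
def qcdq_int8(xs, scale, zero_point=0):
    """Illustrative Quantize-Clip-DeQuantize round-trip for signed 8-bit.

    Mirrors the arithmetic of an ONNX QuantizeLinear / Clip /
    DequantizeLinear chain; `scale` and `zero_point` are the usual
    affine quantization parameters.
    """
    out = []
    for x in xs:
        q = round(x / scale) + zero_point   # quantize to the integer grid
        q = max(-128, min(127, q))          # clip to the signed 8-bit range
        out.append((q - zero_point) * scale)  # dequantize back to float
    return out

# Values round-trip with at most scale/2 error; out-of-range values saturate.
print(qcdq_int8([0.3, -0.06, 100.0], scale=0.125))
```

Because the clip step saturates at [-128, 127], any value larger than `127 * scale` maps to the top of the representable range instead of overflowing.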

I am currently working on updating this script to improve and speed up GPTQ computation, add an FID score comparison against the floating-point version, and make other minor improvements.
You can find the PR here: https://github.com/Xilinx/brevitas/pull/951

Feel free to open an issue/discussion if you run into any problems. It's under active development, so bugs may be present.
