Is there an FP8 version?

#1
by zhaoqi - opened

The full version is too big, is there an fp8 version?

ShuttleAI org

@Fizzarolli

Could you share code for using the FP8 and GGUF versions? The model cards only include code for the bf16 version.

https://huggingface.co/shuttleai/shuttle-3-diffusion-fp8
https://huggingface.co/shuttleai/shuttle-3-diffusion-GGUF

OSError: Error no file named model_index.json found in directory shuttleai/shuttle-3-diffusion-fp8.
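That error comes up because `from_pretrained` expects a full diffusers-format repo containing `model_index.json`, while the fp8 and GGUF repos ship single-file checkpoints. As a hedged sketch: if Shuttle 3 follows the Flux architecture, diffusers can load a GGUF checkpoint via `FluxTransformer2DModel.from_single_file` with a `GGUFQuantizationConfig`, then reuse the rest of the pipeline from the bf16 repo. The `.gguf` filename below is an assumption; check the repo's file list for the real one.

```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

# Assumed filename -- check shuttleai/shuttle-3-diffusion-GGUF for the actual file
ckpt = (
    "https://huggingface.co/shuttleai/shuttle-3-diffusion-GGUF"
    "/blob/main/shuttle-3-diffusion-Q8_0.gguf"
)

# Load only the transformer from the single-file GGUF checkpoint,
# dequantizing on the fly to bf16 for compute
transformer = FluxTransformer2DModel.from_single_file(
    ckpt,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)

# Text encoders, VAE, and scheduler still come from the bf16 repo
pipe = FluxPipeline.from_pretrained(
    "shuttleai/shuttle-3-diffusion",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # optional: reduces VRAM usage
```

The same `from_single_file` pattern applies to the fp8 safetensors checkpoint; only the directory-style `from_pretrained` call fails on those repos.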

ShuttleAI org

Not sure; you should probably be using a backend like ComfyUI or Forge for that.

ShuttleAI org

If you are using transformers, I recommend https://github.com/huggingface/optimum-quanto

Example code (quantizing the transformer of the bf16 pipeline in place):

```python
import torch
from diffusers import DiffusionPipeline
from optimum.quanto import freeze, quantize, qint8, qfloat8, qint4

# Load the full bf16 pipeline first
pipe = DiffusionPipeline.from_pretrained(
    "shuttleai/shuttle-3-diffusion", torch_dtype=torch.bfloat16
)

quantize(
    pipe.transformer,
    # weights=qfloat8,  # use qfloat8 for FP8 weights instead
    weights=qint8,
    exclude=[  # keep norms and embedders in bf16 to preserve quality
        "*.norm", "*.norm1", "*.norm2", "*.norm2_context",
        "proj_out", "x_embedder", "norm_out", "context_embedder",
    ],
)
freeze(pipe.transformer)
# pipe.enable_model_cpu_offload()  # optional: offload to save VRAM
```

Sign up or log in to comment