Is there an FP8 version?

#1
by zhaoqi - opened

The full version is too big, is there an fp8 version?

ShuttleAI org

@Fizzarolli

Could you share code for using the FP8 and GGUF versions? The model cards only include code for the bf16 version.

https://huggingface.co/shuttleai/shuttle-3-diffusion-fp8
https://huggingface.co/shuttleai/shuttle-3-diffusion-GGUF

OSError: Error no file named model_index.json found in directory shuttleai/shuttle-3-diffusion-fp8.
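That error comes up because `from_pretrained` expects a full diffusers-format repo containing `model_index.json`, while the fp8 and GGUF repos ship single-file checkpoints. As a hedged sketch: if Shuttle 3 follows the Flux architecture, diffusers can load a GGUF checkpoint via `FluxTransformer2DModel.from_single_file` with a `GGUFQuantizationConfig`, then reuse the rest of the pipeline from the bf16 repo. The `.gguf` filename below is an assumption; check the repo's file list for the real one.

```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

# Assumed filename -- check shuttleai/shuttle-3-diffusion-GGUF for the actual file
ckpt = (
    "https://huggingface.co/shuttleai/shuttle-3-diffusion-GGUF"
    "/blob/main/shuttle-3-diffusion-Q8_0.gguf"
)

# Load only the transformer from the single-file GGUF checkpoint,
# dequantizing on the fly to bf16 for compute
transformer = FluxTransformer2DModel.from_single_file(
    ckpt,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)

# Text encoders, VAE, and scheduler still come from the bf16 repo
pipe = FluxPipeline.from_pretrained(
    "shuttleai/shuttle-3-diffusion",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # optional: reduces VRAM usage
```

The same `from_single_file` pattern applies to the fp8 safetensors checkpoint; only the directory-style `from_pretrained` call fails on those repos.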

ShuttleAI org

Not sure; you should probably be using a backend like ComfyUI or Forge for that.

ShuttleAI org

If you are using transformers, I recommend https://github.com/huggingface/optimum-quanto

Example code (quantizing the transformer of the bf16 pipeline in place):

```python
import torch
from diffusers import DiffusionPipeline
from optimum.quanto import freeze, quantize, qint8, qfloat8, qint4

# Load the full bf16 pipeline first
pipe = DiffusionPipeline.from_pretrained(
    "shuttleai/shuttle-3-diffusion", torch_dtype=torch.bfloat16
)

quantize(
    pipe.transformer,
    # weights=qfloat8,  # use qfloat8 for FP8 weights instead
    weights=qint8,
    exclude=[  # keep norms and embedders in bf16 to preserve quality
        "*.norm", "*.norm1", "*.norm2", "*.norm2_context",
        "proj_out", "x_embedder", "norm_out", "context_embedder",
    ],
)
freeze(pipe.transformer)
# pipe.enable_model_cpu_offload()  # optional: offload to save VRAM
```

Sign up or log in to comment