Overview

These models are made to work with stable-diffusion.cpp release master-ac54e00 onwards. Support for other inference backends is not guaranteed.

Quantized using this PR https://github.com/leejet/stable-diffusion.cpp/pull/447

Plain K-quants do not work properly with SD3.5 Large models: around 90% of the weights are in tensors whose shape doesn't match the 256-element superblock size of K-quants, so those tensors can't be quantized this way. Mixing quantization types allows us to take advantage of the better fidelity of K-quants to some extent while keeping the model file size relatively small.

Only the second layers of both MLPs in each MMDiT block of SD3.5 Large models have a shape that is compatible with K-quants. These layers still account for about 10% of all the parameters.
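The shape constraint described above can be sketched as a simple divisibility check. This is only an illustration, not the code from the PR: it assumes that quantization blocks run along each weight row (the input-feature dimension), that the hidden width is 2432 with a 4x MLP expansion, and the tensor names are hypothetical.

```python
# Minimal sketch of choosing a quantization type per tensor.
# Assumption: blocks are laid out along each row (input features).
QK_K = 256          # K-quant superblock size
LEGACY_BLOCK = 32   # block size of legacy types such as q4_0


def pick_quant_type(row_length, k_type="q4_k", fallback_type="q4_0"):
    """Return k_type if rows split evenly into superblocks, else a fallback."""
    if row_length % QK_K == 0:
        return k_type
    if row_length % LEGACY_BLOCK == 0:
        return fallback_type
    return "f16"  # leave unquantized when no block size fits


# Hypothetical tensor names and shapes, written as (out_features, in_features).
HIDDEN = 2432  # assumed hidden width, not taken from this model card
example_tensors = {
    "attn.qkv.weight": (3 * HIDDEN, HIDDEN),
    "mlp.fc1.weight": (4 * HIDDEN, HIDDEN),
    "mlp.fc2.weight": (HIDDEN, 4 * HIDDEN),
}

plan = {name: pick_quant_type(shape[1]) for name, shape in example_tensors.items()}

for name, qtype in plan.items():
    print(f"{name}: {qtype}")
```

Under these assumptions only the second MLP layer has a row length (9728) that is a multiple of 256, so it is the only tensor that gets the K-quant type; the others fall back to the legacy type.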

Files:

Non-Linear Type:

  • sd3.5_large_turbo-iq4_nl.gguf: Same size as q4_k_4_0 and q4_0, runs faster than q4_k_4_0 (on Vulkan at least), and provides image quality somewhat comparable to the q5_1 model. Recommended.

Mixed Types:

Legacy types:

Outputs:

Sorted by model size (Note that q4_0, q4_k_4_0, and iq4_nl are the exact same size)

| Quantization | Robot girl | Text | Cute kitten |
| --- | --- | --- | --- |
| q2_k_4_0 | q2_k_4_0 | q2_k_4_0 | q2_k_4_0 |
| q3_k_4_0 | q3_k_4_0 | q3_k_4_0 | q3_k_4_0 |
| q4_0 | q4_0 | q4_0 | q4_0 |
| q4_k_4_0 | q4_k_4_0 | q4_k_4_0 | q4_k_4_0 |
| iq4_nl | iq4_nl | iq4_nl | iq4_nl |
| q4_k_4_1 | q4_k_4_1 | q4_k_4_1 | q4_k_4_1 |
| q4_1 | q4_1 | q4_1 | q4_1 |
| q4_k_5_0 | q4_k_5_0 | q4_k_5_0 | q4_k_5_0 |
| q5_0 | q5_0 | q5_0 | q5_0 |
| q5_1 | q5_1 | q5_1 | q5_1 |
| q8_0 | q8_0 | q8_0 | q8_0 |
| f16(sft) | f16 | f16 | f16 |
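The note that q4_0, q4_k_4_0, and iq4_nl are the exact same size can be checked from the storage cost per block. The sketch below uses the block layouts as I understand them from ggml (weights per block and bytes per block). It also assumes that a name like q4_k_4_0 means q4_k for the compatible tensors and q4_0 for the rest, which is inferred from the naming rather than stated here.

```python
# Block layouts as (weights per block, bytes per block).
# Byte counts are written as sums of the block's fields.
BLOCK_LAYOUTS = {
    "q4_0": (32, 2 + 16),          # fp16 scale + 32 packed 4-bit values
    "iq4_nl": (32, 2 + 16),        # fp16 scale + 32 packed 4-bit table indices
    "q4_k": (256, 4 + 12 + 128),   # two fp16 scales + sub-block scales + 4-bit values
    "q4_1": (32, 4 + 16),          # fp16 scale and min + 4-bit values
    "q5_0": (32, 2 + 4 + 16),      # fp16 scale + high bits + 4-bit values
}


def bits_per_weight(name):
    """Average storage cost of one weight, in bits, for a quantization type."""
    weights, nbytes = BLOCK_LAYOUTS[name]
    return nbytes * 8 / weights


for name in BLOCK_LAYOUTS:
    print(f"{name}: {bits_per_weight(name)} bits per weight")
```

Since q4_0, q4_k, and iq4_nl all come out at the same cost per weight, any mix of them over the same tensors gives the same file size, while the mixes with q4_1 or q5_0 come out larger.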

Generated with a modified version of sdcpp with this PR applied to enable clip timestep embeddings support.

Text encoders used: q4_k quant of t5xxl, full precision clip_g, and q8 quant of ViT-L-14-TEXT-detail-improved-hiT-GmP-TE-only-HF in place of clip_l.

Full prompts and settings in png metadata.
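One way to read that metadata without extra dependencies is to walk the PNG chunks and collect the `tEXt` entries. This is a minimal sketch using only the Python standard library. The key name `parameters` in the demo is an assumption, the sample bytes are made up for illustration, and compressed or international text chunks (`zTXt`, `iTXt`) are not handled.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data):
    """Return a dict of keyword -> text for every tEXt chunk in PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte length, 4-byte type, body, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            key, _, value = body.partition(b"\x00")
            texts[key.decode("latin-1")] = value.decode("latin-1")
        elif chunk_type == b"IEND":
            break
        pos += 8 + length + 4
    return texts


def make_chunk(chunk_type, body):
    """Build one PNG chunk (used only to create demo data below)."""
    crc = zlib.crc32(chunk_type + body)
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)


# Made-up demo stream: signature, one text chunk, end marker (no image data).
sample = (
    PNG_SIGNATURE
    + make_chunk(b"tEXt", b"parameters\x00a cute kitten")
    + make_chunk(b"IEND", b"")
)
sample_texts = read_png_text_chunks(sample)
print(sample_texts)
```

For a real file, pass the file contents instead, for example `read_png_text_chunks(open("image.png", "rb").read())`, where `image.png` is a placeholder path.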
