comfyui

The sampling acceleration from nvfp4 quantization in Krea2 is not significant.

#6
by Aca233 - opened

[Screenshot QQ20260625-164559: FP8]
[Screenshot QQ20260625-164632: NVFP4]

RTX 5060, 64 GB DDR4

Same here: RTX 5060 Ti 16 GB, 96 GB RAM.

Me too. nvfp4 is even slower than mxfp8. RTX 5080, 96 GB RAM.

Comfy Org org

I possibly uploaded the wrong version, one that doesn't allow the fast nvfp4 matmuls. It's re-uploaded now. For me it's ~15% faster than fp8 on a 5090. Not all layers could use nvfp4 matmuls because of rather bad quality loss, so it's not going to be that much faster, but it definitely should not be slower.
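
To illustrate why quantizing only some layers caps the end-to-end gain, here is a rough sketch using Amdahl's law. The function name `overall_speedup`, the fractions, and the 2x per-layer figure are all hypothetical values chosen for illustration; they are not measurements from this model.

```python
def overall_speedup(accelerated_fraction: float, layer_speedup: float) -> float:
    """Amdahl's law: end-to-end speedup when only part of the step time gets faster.

    accelerated_fraction: share of the original step time spent in layers that
        can use the faster matmuls (assumed value, between 0 and 1).
    layer_speedup: how much faster those layers run (assumed value, > 0).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if layer_speedup <= 0.0:
        raise ValueError("layer_speedup must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / layer_speedup)


if __name__ == "__main__":
    # Hypothetical scenario: accelerated layers run 2x faster than before.
    for fraction in (0.25, 0.50, 0.75, 1.00):
        gain = overall_speedup(fraction, layer_speedup=2.0)
        print(f"{fraction:.0%} of step time accelerated -> {gain:.2f}x overall")
```

Under these assumptions, the overall gain stays well below the per-layer gain unless nearly all of the step time is spent in the accelerated layers.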

Thanks for the update! I just used the new nvfp4. Generating that 3840x2160 image went from 20s per step down to 15s per step.
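
For anyone comparing their own runs, a small sketch that converts two per-step timings into a speedup factor and a percentage of time saved. The helper name `compare_step_times` is made up for this example; the 20 s and 15 s inputs are the figures reported above.

```python
def compare_step_times(before_s: float, after_s: float) -> dict:
    """Return the speedup factor and the percentage of time saved per step."""
    if before_s <= 0.0 or after_s <= 0.0:
        raise ValueError("step times must be positive")
    return {
        "speedup": before_s / after_s,
        "time_saved_pct": (before_s - after_s) / before_s * 100.0,
    }


if __name__ == "__main__":
    result = compare_step_times(before_s=20.0, after_s=15.0)
    print(f"speedup: {result['speedup']:.2f}x")           # speedup: 1.33x
    print(f"time saved: {result['time_saved_pct']:.1f}%")  # time saved: 25.0%
```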
