The sampling acceleration from nvfp4 quantization in Krea2 is not significant.

by Aca233 - opened 1 day ago

Discussion

Aca233

1 day ago

FP8

NVFP4

Aca233

1 day ago

RTX5060 64GB DDR4

Lowlay

1 day ago

Same here: RTX 5060 Ti 16 GB, 96 GB RAM.

zdjun1984

about 24 hours ago

Me too. nvfp4 is even slower than mxfp8. RTX 5080, 96 GB RAM.

Kijai

Comfy Org org about 24 hours ago

I possibly uploaded the wrong version of it, one which doesn't allow the fast nvfp4 matmuls; re-uploaded now. For me it's ~15% faster than fp8 on a 5090. Not all layers could use nvfp4 matmuls due to rather bad quality loss, so it's not going to be that much faster, but it should definitely not be slower at least.
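To illustrate why leaving some layers out of nvfp4 caps the overall gain, here is a rough Amdahl's-law style sketch. The function name, the fractions, and the 2x per-layer figure are all made-up assumptions for illustration, not measurements from this model or from ComfyUI.

```python
def overall_speedup(fraction_accelerated: float, layer_speedup: float) -> float:
    """Amdahl's-law estimate of end-to-end speedup.

    fraction_accelerated: share of step time spent in layers that can use
        the faster matmuls (hypothetical value, between 0 and 1).
    layer_speedup: how much faster those layers run (hypothetical value).
    """
    remaining = 1.0 - fraction_accelerated
    return 1.0 / (remaining + fraction_accelerated / layer_speedup)


# Hypothetical scenario: accelerated layers run 2x faster than before.
for fraction in (0.25, 0.5, 0.75):
    print(f"{fraction:.0%} of step time accelerated -> "
          f"{overall_speedup(fraction, 2.0):.2f}x overall")
```

Under these assumed numbers the overall gain stays well below the per-layer gain unless most of the step time is spent in the accelerated layers, which fits a modest speedup when only part of the model is quantized.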

Aca233

about 23 hours ago

Thanks for the update! I just used the new nvfp4. Generating that 3840x2160 image went from 20s per step down to 15s per step.
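For anyone comparing their own numbers, the step times above can be turned into a speedup figure with a small sketch like this. The helper name `step_speedup` is just an illustration, not part of any tool mentioned in this thread.

```python
def step_speedup(before_s: float, after_s: float) -> tuple[float, float]:
    """Return (speedup factor, fraction of time saved) for two per-step times."""
    speedup = before_s / after_s          # how many times faster each step is
    time_saved = 1.0 - after_s / before_s  # share of the original time removed
    return speedup, time_saved


# Per-step times reported above: 20 s with the old file, 15 s with the new one.
speedup, saved = step_speedup(20.0, 15.0)
print(f"{speedup:.2f}x faster, {saved:.0%} less time per step")
# prints "1.33x faster, 25% less time per step"
```

Note that this compares the old and new nvfp4 uploads with each other, not nvfp4 against fp8.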
