Why does checking "Load SDXL-Refiner" or using the refiner model make image generation extremely slow?

#4
by supwang - opened

Hi,
Model Version: SD-XL base, 8sec per image :)
Model Version: SD-XL Refiner, 15mins per image @_@

Is this a normal situation?
If I switch models, why does the image generation speed of SD-XL base also change to 15mins per image!?
Thank you.

You don't have enough VRAM. When I load it with batch size 2 it uses 20GB of VRAM ^^ My GPU can handle it, but yours may not have enough. This is a big 6.6B-parameter model. You might be able to run it if you offload the initial model first.
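As a rough sanity check on those numbers, here is a back-of-the-envelope estimate of the memory needed for the weights alone. It takes the 6.6B parameter figure quoted above at face value and assumes 2 bytes per parameter for fp16 and 4 for fp32; activations, the VAE, and framework overhead are not counted, and `weight_memory_gib` is a hypothetical helper name.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Assumption: 6.6e9 parameters, as quoted in the reply above.
fp16_gib = weight_memory_gib(6.6e9)      # half precision, 2 bytes per parameter
fp32_gib = weight_memory_gib(6.6e9, 4)   # full precision, 4 bytes per parameter

print(f"fp16: {fp16_gib:.1f} GiB, fp32: {fp32_gib:.1f} GiB")
# prints "fp16: 12.3 GiB, fp32: 24.6 GiB"
```

Under these assumptions, the fp32 weights alone would not fit on a 24GB card, while fp16 weights leave some headroom.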

Thank you. But I only use batch size 1 and I'm using a 4090. The refiner is too slow.

That's not normal. On my 3090 the refiner takes no longer than the base model. Do you have other programs open that are consuming VRAM?

Nothing is consuming VRAM except SDXL. So it's strange. @_@
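One way to verify this is to ask PyTorch how much device memory is free before loading any model. This is a minimal sketch, assuming PyTorch with CUDA support is installed. `torch.cuda.mem_get_info` is a real PyTorch call that reports device-wide free and total memory, so usage by other processes shows up too; `describe_vram` and `query_vram` are hypothetical helper names.

```python
def describe_vram(free_bytes: int, total_bytes: int) -> str:
    """Format free/total device memory as a short human-readable line."""
    gib = 1024**3
    used_bytes = total_bytes - free_bytes
    return (
        f"{used_bytes / gib:.1f} GiB in use, "
        f"{free_bytes / gib:.1f} GiB free of {total_bytes / gib:.1f} GiB"
    )


def query_vram(device: int = 0) -> str:
    """Query the given CUDA device and describe its memory usage."""
    import torch  # imported lazily so describe_vram works without a GPU

    free_bytes, total_bytes = torch.cuda.mem_get_info(device)
    return describe_vram(free_bytes, total_bytes)
```

Calling `print(query_vram())` once before loading the base model and once before loading the refiner would show whether the card is already close to full.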

I have the same problem, also on a 4090.

Do not use:
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)

I get some warnings with it when I use nvcr.io/nvidia/pytorch:23.06-py3 in a Docker container.
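One way to make that line easy to switch off is to put it behind a flag. This is only a sketch: `maybe_compile` and `use_compile` are hypothetical names, not part of diffusers or the SDXL demo, and the `torch.compile` arguments are copied from the line above.

```python
def maybe_compile(unet, use_compile: bool = False):
    """Return the UNet unchanged unless compilation is explicitly requested."""
    if not use_compile:
        return unet
    import torch  # only needed when compiling

    return torch.compile(unet, mode="reduce-overhead", fullgraph=True)


# Usage with an already-loaded pipeline `pipe` (assumed to exist):
# pipe.unet = maybe_compile(pipe.unet, use_compile=False)
```

With the flag off, the UNet runs in eager mode, which makes it easier to tell whether the compile step is responsible for the slowdown.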

How do I set the parameter?

supwang changed discussion status to closed
