Is Waifu Diffusion 1.4 set for 768-square images or 512-square ones?

#24
by 07mk - opened

I see that the recently released Waifu Diffusion 1.4 is based on the Stable Diffusion 2.1 release, which is itself designed for 768-square image generation. Thus I would assume that Waifu Diffusion 1.4 is also designed for 768-square images. However, I wanted to verify this, because I understand that the resolution of the images used for fine-tuning matters when using a fine-tuned model like Waifu Diffusion.

So far I'm finding WD1.4 to be excellent for my inpainting needs at both 512-square and 768-square, but I wanted to know what the proper way to use it is.

I believe it's trained on 640 x 640 images. Or maybe I'm misunderstanding the tweet.

https://twitter.com/Birchlabs/status/1609364716320686081?s=20&t=j9VaMRXGS-NSUKWYP5UcOA

That tweet indicates that 640x640 is only the 'average' resolution, and the training images aren't necessarily square. So images that are 768x512 or 512x768 are probably in the training data.
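
To make the "average resolution" idea concrete, here is a rough sketch that lists width/height pairs whose pixel count stays close to 640x640. It assumes "average 640x640" means the pixel area is kept near 640*640 while the aspect ratio varies; the 64-pixel step, the 512 to 768 side range, the 5% tolerance, and the helper name `candidate_resolutions` are all my own assumptions for illustration, not anything confirmed about how WD1.4 was actually trained.

```python
# Hypothetical helper, not taken from the Waifu Diffusion training code.
# Assumption: "average 640x640" means each image's pixel area is close to
# 640 * 640, with the aspect ratio free to vary.
TARGET_AREA = 640 * 640  # 409,600 pixels


def candidate_resolutions(target_area=TARGET_AREA, step=64,
                          min_side=512, max_side=768, tolerance=0.05):
    """Return (width, height) pairs whose area is within `tolerance`
    (as a fraction) of `target_area`. Sides are multiples of `step`."""
    sizes = []
    for width in range(min_side, max_side + 1, step):
        for height in range(min_side, max_side + 1, step):
            area = width * height
            if abs(area - target_area) / target_area <= tolerance:
                sizes.append((width, height))
    return sizes


if __name__ == "__main__":
    for width, height in candidate_resolutions():
        print(f"{width}x{height} = {width * height} pixels")
```

Under those assumptions, 512x768 and 768x512 both land within 5% of the 640x640 pixel budget, which fits the guess that they are in the training data. 768x768 is about 44% more pixels than 640x640, so it falls outside this list.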

From my testing, it handles 768x768 resolution pictures fine and without doubling up scenery.

I see, thanks for the Tweet link. It does seem like it's designed to be somewhat flexible for usage in different resolutions, which has been my experience so far. Good to know.
