Question about the sample size in VAE and UNet

#171
by MAPLELEAF3659 - opened

Does anyone know how the sample size works in SD's VAE and UNet?
All I know is that SD v1.5 was trained at 512×512, so it generates 512×512 images most reliably. But when I set the pipeline to 384×384 or even 768×768, it still seems to generate images (though less accurately).
I'm wondering whether SD (or LDM in general) can generalize to different sample sizes, so that inference at any width and height is possible. If so, how does that work in training and inference?
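
To make the question concrete, here is a minimal sketch of how I think the requested image size maps to the latent size the UNet sees. It assumes the SD v1.5 VAE downsamples by a factor of 8 and uses 4 latent channels (so 512×512 would correspond to the UNet's default sample size of 64). `latent_shape` is a hypothetical helper I wrote for illustration, not a function from any library.

```python
def latent_shape(height, width, vae_scale_factor=8, latent_channels=4):
    """Return (channels, latent_height, latent_width) for a requested image size.

    Assumption: the VAE shrinks each spatial dimension by `vae_scale_factor`
    (8 for SD v1.5) and produces `latent_channels` channels (4 for SD v1.5).
    """
    if height % vae_scale_factor or width % vae_scale_factor:
        raise ValueError(
            f"height and width must be divisible by {vae_scale_factor}, "
            f"got {height}x{width}"
        )
    return (
        latent_channels,
        height // vae_scale_factor,
        width // vae_scale_factor,
    )


# Compare the three resolutions mentioned above.
for size in (384, 512, 768):
    print(f"{size}x{size} image -> latent {latent_shape(size, size)}")
```

If this is right, changing the width and height only changes the spatial size of the latent tensor, not the weights, which may be why other resolutions run at all even though they were not seen in training.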
