Why do LoRA preview images look so different during training with cfg=1 vs. cfg=4?

#7
by adwad - opened

When training with cfg=1 and validating with cfg=3.5, the previews look like the following: not much like the character I'm training, but still normal images.
flux_lora_trainer_sheet_00004_.png
When training with cfg=4 and validating with cfg=3.5, there is obvious noise while the LoRA is still underfitted.
flux_lora_trainer_sheet_00005_.png

I'm new to LoRA and curious about this.

Which model is used to generate the samples? I'm not on the latest version, but I had to disable samples because they systematically came out horrible when training on de-distilled.
First, the default setting used only 20 steps, whereas de-distilled needs at least 40.
Second, there was no support for CFG > 4.
De-distilled is TERRIBLE when using CFG = 1. This looks close to it.

You have to know what settings are used to generate the samples before concluding anything.
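
As a rough illustration of that check, here is a minimal sketch that flags sample settings likely to give misleading previews. The names `SampleSettings` and `check_sample_settings` are hypothetical and not part of any trainer; the thresholds are only the ones mentioned above (at least 40 steps and CFG above 1 for de-distilled).

```python
from dataclasses import dataclass


@dataclass
class SampleSettings:
    # Hypothetical container; "model" is either "distilled" or "de-distilled".
    model: str
    steps: int
    cfg: float


def check_sample_settings(s: SampleSettings) -> list[str]:
    """Return warnings for preview settings that may not suit the model.

    Thresholds are rules of thumb from this thread, not official limits.
    """
    warnings = []
    if s.model == "de-distilled":
        if s.steps < 40:
            warnings.append(f"steps={s.steps}: de-distilled reportedly needs at least 40")
        if s.cfg <= 1.0:
            warnings.append(f"cfg={s.cfg}: de-distilled reportedly looks terrible at CFG = 1")
    return warnings


if __name__ == "__main__":
    for w in check_sample_settings(SampleSettings("de-distilled", 20, 1.0)):
        print("warning:", w)
```

If the list comes back empty, the preview settings at least match what is discussed here; it says nothing about the quality of the LoRA itself.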

Thank you for replying! This has really troubled me for two or three days.
For the images above:
Image 1 was trained on the original Flux dev (distilled) UNet with cfg=1 and generated with it at cfg=3.5.
Image 2 (the terrible one) was trained on the de-distilled UNet and generated with it, both at cfg=3.5.

The good news is that today I tried generating with the LoRA trained on de-distilled applied to the distilled UNet, and everything seems normal!
Also, I noticed that the de-distilled model seems to need a different guidance node for inference; otherwise it generates terrible results even without a LoRA. To preview generated images I used the validation node from FluxTrainer (https://github.com/kijai/ComfyUI-FluxTrainer), which was made for original Flux dev training, so I guess that's why the results were bad.
After switching to a new workflow and setting steps to 60 following your advice, the de-distilled model generates good results too, with or without the LoRA!
That said, I'm not really sure what the new workflow does differently. One thing I noticed is that it contains negative guidance. I will attach the workflows below.
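
To show what negative guidance does, here is a minimal sketch of standard classifier-free guidance (CFG), assuming the de-distilled workflow uses it in the usual form. `cfg_combine` is a hypothetical helper name, and plain lists of floats stand in for the model's conditional (positive prompt) and unconditional (negative prompt) predictions.

```python
def cfg_combine(cond: list[float], uncond: list[float], scale: float) -> list[float]:
    """Standard CFG: start from the negative-prompt prediction and push
    toward the positive-prompt prediction by `scale`."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


if __name__ == "__main__":
    cond = [1.0, 2.0]     # toy prediction with the positive prompt
    uncond = [0.5, 1.0]   # toy prediction with the negative/empty prompt
    print(cfg_combine(cond, uncond, 1.0))  # scale 1: negative branch cancels out
    print(cfg_combine(cond, uncond, 3.5))  # scale > 1: extrapolates away from uncond
```

At scale 1 the result equals the conditional prediction alone, so the negative branch has no effect. That is at least consistent with a workflow built without a negative branch behaving like CFG = 1 on a model that expects real CFG.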

Another problem is that a LoRA trained on the original UNet with cfg>1 and applied to the original UNet still generates bad results!! (Not as bad as image 2 above, but not good either; see images 3 and 4 below.) I can't find a reason for that. Does the original model just work badly when trained with cfg>1? Would it get better if I trained for more steps?
Image 3: train cfg=6, 100 steps; test cfg=3.5, 20 steps; both on the distilled UNet:
企业微信截图_17348809831855.png
Image 4: train cfg=3.5, 100 steps; test cfg=3.5, 20 steps; both on the distilled UNet:
企业微信截图_17348845062981.png
Image 5: train cfg=1, 100 steps; test cfg=3.5, 20 steps; both on the distilled UNet. The character and style are clearly much more consistent, and the face is stable:
企业微信截图_17348853462943.png

(I will upload a better one for image 5, but I have to go to sleep now...)

I'm also unsure about this: if the FluxTrainer nodes don't work for de-distilled inference, would that cause problems in training too? I haven't seen any problems training LoRAs with them, and the LoRA trained with these nodes on the de-distilled model looks fine. But theoretically, if it infers differently, shouldn't it train differently too? Would that decrease the quality of the trained LoRA?

Inference workflow for de-distilled: https://files.catbox.moe/d362dg.png (As you may already know, ComfyUI can open PNG files as workflows.)
Normal inference workflow: https://comfyui-wiki.com/en/tutorial/advanced/flux1-comfyui-guide-workflow-and-examples (see "Part 2: Flux.1 ComfyUI Original Workflow Example")
