Hyperparameters and steps to sudden convergence

#9
by brycegoh - opened

Hi Diffusers team, I am trying to train my own ControlNet and was wondering whether you have any tips for training an SDXL ControlNet.

Your team seems to have used a different script from the documented one, and it looks like you use the EDM formulation.

Would appreciate opinions on:

  1. How does the EDM formulation help the training process?
  2. What hyperparameters did your team use? (I see that batch size and LR are documented, but what about gradient accumulation, etc.?)
  3. Did your team notice any patterns for sudden convergence?

I would appreciate any assistance and advice. Thanks!

cc: @patrickvonplaten
