OFA-Sys/small-stable-diffusion-v0 · Nice model, but can't fine-tune

I just fine-tuned https://huggingface.co/lambdalabs/miniSD-diffusers
which is a similar model to this one, except with 256x256 resolution.

The fine-tuning went well, but when I use the same approach on this model, I quickly get zero step loss, at 100 steps. Do you know why? Here are my params:

!accelerate launch --mixed_precision="fp16" train_text_to_image.py \
  --pretrained_model_name_or_path=OFA-Sys/small-stable-diffusion-v0 \
  --use_ema \
  --resolution=512 \
  --train_batch_size=64 \
  --max_train_steps=1000000 \
  --checkpointing_steps=200 \
  --learning_rate=4e-7 \
  --max_grad_norm=1 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --noise_offset=0.05

I tried these learning rates:

4e-7
1e-5
1e-6

and always get zero loss.

train_text_to_image.py is https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image.py