LoRA
Thanks for this model, it works great!
Can we create LoRAs with this model? Meaning, will the diffusers LoRA training code work with this model?
Yes, all the Diffusers training scripts are fully supported with SSD-1B!
@jffacevedo
If you create a LoRA for this model, please share it, so I can test if the Diffusers LoRA loading code will support loading LoRAs on SSD-1B
(I'm on 6 GB VRAM and LoRA training is not possible for me)
Thank you, I'll give it a try and share the results.
@jffacevedo Were you able to train the LoRA successfully?
@Warlord-K
I was able to train successfully, but the validation step of the script failed with RuntimeError: Input type (c10::Half) and bias type (float) should be the same. It still saved the checkpoints; here is the result after 2 epochs.
With LoRA
Without LoRA
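A with/without comparison like the one above could be produced roughly as follows. This is a minimal sketch, not the script used in this thread: the function name `compare_with_and_without_lora` is hypothetical, the output directory and prompt are taken from the training command below, and it assumes a CUDA GPU with enough memory for fp16 inference. The heavy imports are deferred into the function so the file can be imported on a machine without a GPU.

```python
def compare_with_and_without_lora(
    lora_dir="sd-pokemon-model-lora-sdxl",  # output_dir from the training run
    prompt="cute dragon creature",          # validation prompt from the run
    seed=42,
):
    """Generate one image without the LoRA and one with it, using the same seed."""
    # Imported lazily: both packages are large and need a GPU to be useful here.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "segmind/SSD-1B",
        torch_dtype=torch.float16,
        use_safetensors=True,
        variant="fp16",
    ).to("cuda")

    def generate():
        # Re-seed before every call so both images start from the same noise.
        generator = torch.Generator(device="cuda").manual_seed(seed)
        return pipe(prompt, num_inference_steps=25, generator=generator).images[0]

    without_lora = generate()

    # Load the LoRA weights written by the training script.
    pipe.load_lora_weights(lora_dir, weight_name="pytorch_lora_weights.safetensors")
    with_lora = generate()

    return without_lora, with_lora
```

Calling `compare_with_and_without_lora()` returns two PIL images that can be saved with `.save("without_lora.png")` and `.save("with_lora.png")`. Whether the LoRA loads cleanly on SSD-1B is exactly what this thread is trying to establish, so treat a failure here as a useful data point rather than a usage error.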
See the full logs below:
accelerate launch train_text_to_image_lora_sdxl.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$DATASET_NAME \
  --caption_column="text" \
  --resolution=1024 \
  --random_flip \
  --train_batch_size=1 \
  --num_train_epochs=2 \
  --checkpointing_steps=500 \
  --learning_rate=1e-04 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --mixed_precision="fp16" \
  --seed=42 \
  --output_dir="sd-pokemon-model-lora-sdxl" \
  --validation_prompt="cute dragon creature"
The following values were not passed to `accelerate launch` and had defaults used instead:
`--num_processes` was set to a value of `1`
`--num_machines` was set to a value of `1`
`--mixed_precision` was set to a value of `'no'`
`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
11/07/2023 01:01:10 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Mixed precision type: fp16
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
{'dynamic_thresholding_ratio', 'variance_type', 'thresholding', 'clip_sample_range'} was not found in config. Values will be initialized to default values.
Downloading model.safetensors: 100%|██████████| 492M/492M [00:02<00:00, 220MB/s]
Downloading model.safetensors: 100%|██████████| 2.78G/2.78G [01:10<00:00, 39.7MB/s]
Downloading (…)ch_model.safetensors: 100%|██████████| 335M/335M [00:01<00:00, 200MB/s]
Downloading (…)ch_model.safetensors: 100%|██████████| 5.33G/5.33G [00:23<00:00, 230MB/s]
{'attention_type', 'dropout'} was not found in config. Values will be initialized to default values.
Downloading readme: 100%|██████████| 1.80k/1.80k [00:00<00:00, 11.8MB/s]
Downloading metadata: 100%|██████████| 731/731 [00:00<00:00, 5.92MB/s]
Downloading data: 100%|██████████| 99.7M/99.7M [00:02<00:00, 39.7MB/s]
Downloading data files: 100%|██████████| 1/1 [00:02<00:00, 2.51s/it]
Extracting data files: 100%|██████████| 1/1 [00:00<00:00, 1508.74it/s]
Generating train split: 100%|██████████| 833/833 [00:00<00:00, 2920.80 examples/s]
11/07/2023 01:03:08 - INFO - __main__ - ***** Running training *****
11/07/2023 01:03:08 - INFO - __main__ - Num examples = 833
11/07/2023 01:03:08 - INFO - __main__ - Num Epochs = 2
11/07/2023 01:03:08 - INFO - __main__ - Instantaneous batch size per device = 1
11/07/2023 01:03:08 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 1
11/07/2023 01:03:08 - INFO - __main__ - Gradient Accumulation steps = 1
11/07/2023 01:03:08 - INFO - __main__ - Total optimization steps = 1666
Steps: 30%|███       | 500/1666 [08:46<20:27, 1.05s/it, lr=0.0001, step_loss=0.00503]
11/07/2023 01:11:55 - INFO - accelerate.accelerator - Saving current state to sd-pokemon-model-lora-sdxl/checkpoint-500
Model weights saved in sd-pokemon-model-lora-sdxl/checkpoint-500/pytorch_lora_weights.safetensors
11/07/2023 01:11:55 - INFO - accelerate.checkpointing - Optimizer state saved in sd-pokemon-model-lora-sdxl/checkpoint-500/optimizer.bin
11/07/2023 01:11:55 - INFO - accelerate.checkpointing - Scheduler state saved in sd-pokemon-model-lora-sdxl/checkpoint-500/scheduler.bin
11/07/2023 01:11:55 - INFO - accelerate.checkpointing - Gradient scaler state saved in sd-pokemon-model-lora-sdxl/checkpoint-500/scaler.pt
11/07/2023 01:11:55 - INFO - accelerate.checkpointing - Random states saved in sd-pokemon-model-lora-sdxl/checkpoint-500/random_states_0.pkl
11/07/2023 01:11:55 - INFO - __main__ - Saved state to sd-pokemon-model-lora-sdxl/checkpoint-500
Steps: 50%|█████     | 833/1666 [14:36<14:27, 1.04s/it, lr=0.0001, step_loss=0.00567]
11/07/2023 01:17:45 - INFO - __main__ - Running validation...
Generating 4 images with prompt: cute dragon creature.
{'add_watermarker'} was not found in config. Values will be initialized to default values.
Loaded scheduler as EulerDiscreteScheduler from `scheduler` subfolder of segmind/SSD-1B.
Loaded tokenizer_2 as CLIPTokenizer from `tokenizer_2` subfolder of segmind/SSD-1B.
Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of segmind/SSD-1B.
Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00, 53.80it/s]
Steps: 60%|██████    | 1000/1666 [18:46<11:39, 1.05s/it, lr=0.0001, step_loss=0.105]
11/07/2023 01:21:55 - INFO - accelerate.accelerator - Saving current state to sd-pokemon-model-lora-sdxl/checkpoint-1000
Model weights saved in sd-pokemon-model-lora-sdxl/checkpoint-1000/pytorch_lora_weights.safetensors
11/07/2023 01:21:55 - INFO - accelerate.checkpointing - Optimizer state saved in sd-pokemon-model-lora-sdxl/checkpoint-1000/optimizer.bin
11/07/2023 01:21:55 - INFO - accelerate.checkpointing - Scheduler state saved in sd-pokemon-model-lora-sdxl/checkpoint-1000/scheduler.bin
11/07/2023 01:21:55 - INFO - accelerate.checkpointing - Gradient scaler state saved in sd-pokemon-model-lora-sdxl/checkpoint-1000/scaler.pt
11/07/2023 01:21:55 - INFO - accelerate.checkpointing - Random states saved in sd-pokemon-model-lora-sdxl/checkpoint-1000/random_states_0.pkl
11/07/2023 01:21:55 - INFO - __main__ - Saved state to sd-pokemon-model-lora-sdxl/checkpoint-1000
Steps: 90%|█████████ | 1500/1666 [27:31<02:53, 1.05s/it, lr=0.0001, step_loss=0.0159]
11/07/2023 01:30:40 - INFO - accelerate.accelerator - Saving current state to sd-pokemon-model-lora-sdxl/checkpoint-1500
Model weights saved in sd-pokemon-model-lora-sdxl/checkpoint-1500/pytorch_lora_weights.safetensors
11/07/2023 01:30:40 - INFO - accelerate.checkpointing - Optimizer state saved in sd-pokemon-model-lora-sdxl/checkpoint-1500/optimizer.bin
11/07/2023 01:30:40 - INFO - accelerate.checkpointing - Scheduler state saved in sd-pokemon-model-lora-sdxl/checkpoint-1500/scheduler.bin
11/07/2023 01:30:40 - INFO - accelerate.checkpointing - Gradient scaler state saved in sd-pokemon-model-lora-sdxl/checkpoint-1500/scaler.pt
11/07/2023 01:30:40 - INFO - accelerate.checkpointing - Random states saved in sd-pokemon-model-lora-sdxl/checkpoint-1500/random_states_0.pkl
11/07/2023 01:30:40 - INFO - __main__ - Saved state to sd-pokemon-model-lora-sdxl/checkpoint-1500
Steps: 100%|██████████| 1666/1666 [30:26<00:00, 1.04s/it, lr=0.0001, step_loss=0.126]
11/07/2023 01:33:35 - INFO - __main__ - Running validation...
Generating 4 images with prompt: cute dragon creature.
{'add_watermarker'} was not found in config. Values will be initialized to default values.
Loaded scheduler as EulerDiscreteScheduler from `scheduler` subfolder of segmind/SSD-1B.
Loaded tokenizer_2 as CLIPTokenizer from `tokenizer_2` subfolder of segmind/SSD-1B.
Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of segmind/SSD-1B.
Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00, 60.59it/s]
Model weights saved in sd-pokemon-model-lora-sdxl/pytorch_lora_weights.safetensors
{'add_watermarker'} was not found in config. Values will be initialized to default values.
Loaded scheduler as EulerDiscreteScheduler from `scheduler` subfolder of segmind/SSD-1B.
Loaded text_encoder as CLIPTextModel from `text_encoder` subfolder of segmind/SSD-1B.
Loaded tokenizer_2 as CLIPTokenizer from `tokenizer_2` subfolder of segmind/SSD-1B.
{'attention_type', 'dropout'} was not found in config. Values will be initialized to default values.
Loaded unet as UNet2DConditionModel from `unet` subfolder of segmind/SSD-1B.
Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of segmind/SSD-1B.
Loaded text_encoder_2 as CLIPTextModelWithProjection from `text_encoder_2` subfolder of segmind/SSD-1B.
Loading pipeline components...: 100%|██████████| 7/7 [00:03<00:00, 1.93it/s]
Loading unet.
Loading pipeline components...: 100%|██████████| 7/7 [00:03<00:00, 1.69it/s]
100%|██████████| 25/25 [00:06<00:00, 3.72it/s]
100%|██████████| 25/25 [00:06<00:00, 3.71it/s]
Traceback (most recent call last):
File "/home/jfacevedo_google_com/diffusers/examples/text_to_image/train_text_to_image_lora_sdxl.py", line 1265, in <module>
main(args)
File "/home/jfacevedo_google_com/diffusers/examples/text_to_image/train_text_to_image_lora_sdxl.py", line 1224, in main
images = [
File "/home/jfacevedo_google_com/diffusers/examples/text_to_image/train_text_to_image_lora_sdxl.py", line 1225, in <listcomp>
pipeline(args.validation_prompt, num_inference_steps=25, generator=generator).images[0]
File "/opt/conda/envs/sdxl/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/opt/conda/envs/sdxl/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py", line 1057, in __call__
image = self.vae.decode(latents / self.vae.config.scaling_factor, return_dict=False)[0]
File "/opt/conda/envs/sdxl/lib/python3.10/site-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
return method(self, *args, **kwargs)
File "/opt/conda/envs/sdxl/lib/python3.10/site-packages/diffusers/models/autoencoder_kl.py", line 316, in decode
decoded = self._decode(z).sample
File "/opt/conda/envs/sdxl/lib/python3.10/site-packages/diffusers/models/autoencoder_kl.py", line 288, in _decode
z = self.post_quant_conv(z)
File "/opt/conda/envs/sdxl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/envs/sdxl/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/opt/conda/envs/sdxl/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (c10::Half) and bias type (float) should be the same
Steps: 100%|██████████| 1666/1666 [31:52<00:00, 1.15s/it, lr=0.0001, step_loss=0.126]
Traceback (most recent call last):
File "/opt/conda/envs/sdxl/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/opt/conda/envs/sdxl/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
args.func(args)
File "/opt/conda/envs/sdxl/lib/python3.10/site-packages/accelerate/commands/launch.py", line 986, in launch_command
simple_launcher(args)
File "/opt/conda/envs/sdxl/lib/python3.10/site-packages/accelerate/commands/launch.py", line 628, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/opt/conda/envs/sdxl/bin/python3.10', 'train_text_to_image_lora_sdxl.py', '--pretrained_model_name_or_path=segmind/SSD-1B', '--dataset_name=lambdalabs/pokemon-blip-captions', '--caption_column=text', '--resolution=1024', '--random_flip', '--train_batch_size=1', '--num_train_epochs=2', '--checkpointing_steps=500', '--learning_rate=1e-04', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--mixed_precision=fp16', '--seed=42', '--output_dir=sd-pokemon-model-lora-sdxl', '--validation_prompt=cute dragon creature']' returned non-zero exit status 1.
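The step counts in the log above can be checked by hand: 833 examples at batch size 1 for 2 epochs gives the reported 1666 optimization steps, with checkpoints every 500 steps. The sketch below spells out that arithmetic. The helper names are hypothetical, and the rounding (ceil per epoch, then ceil again for gradient accumulation) is an assumption about how such training scripts usually count steps rather than a quote from the script.

```python
import math


def total_optimization_steps(num_examples, batch_size, num_epochs,
                             grad_accum_steps=1, num_processes=1):
    """Number of optimizer updates for a run, rounding partial batches up."""
    batches_per_epoch = math.ceil(num_examples / (batch_size * num_processes))
    updates_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)
    return updates_per_epoch * num_epochs


def checkpoint_steps(total_steps, every):
    """Steps at which a checkpoint is written when saving every `every` steps."""
    return list(range(every, total_steps + 1, every))


# Values taken from the log: 833 examples, batch size 1, 2 epochs.
total = total_optimization_steps(num_examples=833, batch_size=1, num_epochs=2)
print(total)                         # 1666
print(checkpoint_steps(total, 500))  # [500, 1000, 1500]
```

This matches the checkpoint-500, checkpoint-1000 and checkpoint-1500 directories in the log, and explains why validation ran at steps 833 and 1666, the ends of the two epochs.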
Something might have gone wrong in the training. We'll try the same and get back to you. Thanks for reporting!
I also tried to train a LoRA for SSD-1B but I'm getting this error (Missing key(s) in state_dict): https://github.com/bmaltais/kohya_ss/issues/1665
Is this related to kohya_ss or the model?
Have you tried updating Kohya? It seems like it didn't recognize the model and expects a larger state_dict.
Yeah, I just rechecked and my clone is up to date.
How can I train this model on images of several different people? For example, person p1 has all his pictures in a p1 folder, and person p2 has all his pictures in a p2 folder. I want the model to generate accurate pictures of each person when I prompt with p1 or p2.
You can use the DreamBooth LoRA training script available in Diffusers.
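One way to apply this to the p1/p2 case is to train one LoRA per person, each with its own folder and prompt. The sketch below only builds and prints the launch commands; it does not run them. It assumes the SDXL DreamBooth LoRA script from the Diffusers examples (`train_dreambooth_lora_sdxl.py`) and its `--instance_data_dir` / `--instance_prompt` flags. The helper name, the prompts, the output directory names and the hyperparameter values are placeholders for illustration, not recommendations from this thread.

```python
import shlex


def dreambooth_lora_command(subject_dir, instance_prompt, output_dir,
                            base_model="segmind/SSD-1B"):
    """Build an `accelerate launch` argument list for one subject."""
    return [
        "accelerate", "launch", "train_dreambooth_lora_sdxl.py",
        f"--pretrained_model_name_or_path={base_model}",
        f"--instance_data_dir={subject_dir}",    # folder with this person's photos
        f"--instance_prompt={instance_prompt}",  # prompt that names this person
        f"--output_dir={output_dir}",
        "--resolution=1024",
        "--train_batch_size=1",
        "--learning_rate=1e-4",   # placeholder value
        "--max_train_steps=500",  # placeholder value
        "--mixed_precision=fp16",
    ]


# Hypothetical mapping from each person's folder to the prompt used for them.
subjects = {
    "p1": "a photo of p1 person",
    "p2": "a photo of p2 person",
}

commands = [
    dreambooth_lora_command(folder, prompt, output_dir=f"lora-{folder}")
    for folder, prompt in subjects.items()
]

for cmd in commands:
    # shlex.join quotes arguments that contain spaces, so the line is paste-ready.
    print(shlex.join(cmd))
```

Training a separate LoRA per person keeps the two identities from blending, at the cost of loading the matching LoRA at inference time. Whether a single LoRA can learn both people reliably is not something this thread settles.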