metadata

license: creativeml-openrail-m
tags:
  - stable-diffusion
  - text-to-image
datasets:
  - ProGamerGov/StableDiffusion-v1-5-Regularization-Images

Ukeiyo-style Diffusion

This is the fine-tuned Stable Diffusion model trained on traditional Japanese Ukeiyo-style images. Use the tokens ukeiyoddim style in your prompts for the effect. The model repo also contains a ckpt file , so that you can use the model with your own implementation of stable diffusion.

🧨 Diffusers

This model can be used just like any other Stable Diffusion model. For more information, please have a look at the Stable Diffusion.

You can also export the model to ONNX, MPS and/or FLAX/JAX.

#!pip install diffusers transformers scipy torch
from diffusers import StableDiffusionPipeline
import torch
model_id = "salmonhumorous/ukeiyo-style-diffusion"
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
prompt = "illustration of ukeiyoddim style landscape"
image = pipe(prompt).images[0]
image.save("./ukeiyo_landscape.png")

Training procedure and data

The training for this model was done using a RTX 3090. The training was completed in 28 minutes for a total of 2000 steps. A total of 33 instance images (Images of the style I was aiming for) and 1k Regularization images was used. Regularization images dataset used by ProGamerGov.

Training notebook used by Shivam Shrirao.

Training hyperparameters

The following hyperparameters were used during training:

number of steps : 2000
learning_rate: 1e-6
train_batch_size: 1
scheduler_type: DDIM
number of instance images : 33
number of regularization images : 1000
lr_scheduler : constant
gradient_checkpointing

Results

Below are the sample results for different training steps :

Sample images by model trained for 2000 steps :

prompt = "landscape" prompt = "ukeiyoddim style landscape" prompt = " illustration of ukeiyoddim style landscape"

Acknowledgement

Many thanks to nitrosocke, for inspiration and for the guide. Also thanks, to all the amazing people making stable diffusion easily accessible for everyone.