metadata

license: creativeml-openrail-m
library_name: diffusers
tags:
  - stable-diffusion
  - stable-diffusion-diffusers
  - text-to-image
  - diffusers
  - diffusers-training
  - lora
  - stable-diffusion
  - stable-diffusion-diffusers
  - text-to-image
  - diffusers
  - diffusers-training
  - lora
base_model: runwayml/stable-diffusion-v1-5
inference: true

LoRA text2image fine-tuning - animanatwork/illustrations-lora

These are LoRA adaption weights for runwayml/stable-diffusion-v1-5. The weights were fine-tuned on the animanatwork/text_to_image_dataset dataset.

Below, we can find some images from the dataset:

The images below are generated from the model using the prompt: "a stylized illustration of a woman sitting in a comfortable chair, reading a book. She is wearing a hat, and her expression appears focused and calm. A black cat is also depicted, sitting beside her and looking at the book, suggesting a shared moment of quiet and companionship. The woman is dressed in a casual outfit with yellow shoes, and the overall color scheme is simple, using black, white, and yellow. The setting seems cozy and peaceful, ideal for reading."

Intended uses & limitations

Do NOT use in production. This model was purely created for research purposes.

How to use

# TODO: add an example code snippet for running this diffusion pipeline

Limitations and bias

[TODO: provide examples of latent issues and potential remediations]

Training details

The model was trained on the "animanatwork/text_to_image_dataset" dataset using 10_000 training step (default is 15_000) and took several hours to train. For more details see Colab notebook.
The dataset's tokens were generated using chatGPT vision. During training, I noticed CLIP can only use 77 tokens for a given image. Since most of our image descriptions contained more tokens, we'll have to create a new dataset that doesn't exceed the maximum.

[TODO: describe the data used to train the model]