Stable Diffusion api

A browser interface based on Gradio library for Stable Diffusion.

Installation and Running

Make sure the required dependencies are met and follow the instructions available for both NVidia (recommended) and AMD GPUs.

Alternatively, use online services (like Google Colab):

List of Online Services

Automatic Installation on Windows

Install Python 3.10.6 (Newer version of Python does not support torch), checking "Add Python to PATH".
Install git.
Download the stable-diffusion-webui repository, for example by running git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git.
Run webui-user.bat from Windows Explorer as normal, non-administrator, user.

Automatic Installation on Linux

Install the dependencies:

# Debian-based:
sudo apt install wget git python3 python3-venv
# Red Hat-based:
sudo dnf install wget git python3
# Arch-based:
sudo pacman -S wget git python3

Navigate to the directory you would like the webui to be installed and execute the following command:

bash <(wget -qO- https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh)

Run webui.sh.
Check webui-user.sh for options.

Installation on Apple Silicon

Find the instructions here.

Documentation

The documentation was moved from this README over to the project's wiki.

How to use

Starting from ControlNet 1.1, we begin to use the Standard ControlNet Naming Rules (SCNNRs) to name all models. We hope that this naming rule can improve the user experience.

ControlNet 1.1 include 14 models (11 production-ready models and 3 experimental models):

control_v11p_sd15_canny
control_v11p_sd15_mlsd
control_v11f1p_sd15_depth
control_v11p_sd15_normalbae
control_v11p_sd15_seg
control_v11p_sd15_inpaint
control_v11p_sd15_lineart
control_v11p_sd15s2_lineart_anime
control_v11p_sd15_openpose
control_v11p_sd15_scribble
control_v11p_sd15_softedge
control_v11e_sd15_shuffle
control_v11e_sd15_ip2p
control_v11f1e_sd15_tile

You need to download these models and putting them in the stable-diffusion-webui/models/ControlNet path. You can download all those models from our HuggingFace Model Page.

You need to download Stable Diffusion 1.5 model "v1-5-pruned.ckpt" and put it in the stable-diffusion-webui/models/Stable-diffusion path.

Our python codes will automatically download other annotator models like HED and OpenPose. Nevertheless, if you want to manually download these, you can download all other annotator models from here. All these models should be put in folder "annotator/ckpts".

At the end, you need to first run "bash webui.sh --nowebui" in on terminal and then by running following scripts you can use api feature that we completed in this project.

test_api_text2img.py
test_api_img2img.py
test_api_text2img_controlNet.py
test_api_img2img_controlNet.py

In each test ... .py files you can see one or two dict that you can configure your execution by changing them.

Credits

Licenses for borrowed code can be found in Settings -> Licenses screen, and also in html/licenses.html file.

Stable Diffusion - https://github.com/CompVis/stable-diffusion, https://github.com/CompVis/taming-transformers
k-diffusion - https://github.com/crowsonkb/k-diffusion.git
GFPGAN - https://github.com/TencentARC/GFPGAN.git
CodeFormer - https://github.com/sczhou/CodeFormer
ESRGAN - https://github.com/xinntao/ESRGAN
SwinIR - https://github.com/JingyunLiang/SwinIR
Swin2SR - https://github.com/mv-lab/swin2sr
LDSR - https://github.com/Hafiidz/latent-diffusion
MiDaS - https://github.com/isl-org/MiDaS
Ideas for optimizations - https://github.com/basujindal/stable-diffusion
Cross Attention layer optimization - Doggettx - https://github.com/Doggettx/stable-diffusion, original idea for prompt editing.
Cross Attention layer optimization - InvokeAI, lstein - https://github.com/invoke-ai/InvokeAI (originally http://github.com/lstein/stable-diffusion)
Sub-quadratic Cross Attention layer optimization - Alex Birch (https://github.com/Birch-san/diffusers/pull/1), Amin Rezaei (https://github.com/AminRezaei0x443/memory-efficient-attention)
Textual Inversion - Rinon Gal - https://github.com/rinongal/textual_inversion (we're not using his code, but we are using his ideas).
Idea for SD upscale - https://github.com/jquesnelle/txt2imghd
Noise generation for outpainting mk2 - https://github.com/parlance-zz/g-diffuser-bot
CLIP interrogator idea and borrowing some code - https://github.com/pharmapsychotic/clip-interrogator
Idea for Composable Diffusion - https://github.com/energy-based-model/Compositional-Visual-Generation-with-Composable-Diffusion-Models-PyTorch
xformers - https://github.com/facebookresearch/xformers
DeepDanbooru - interrogator for anime diffusers https://github.com/KichangKim/DeepDanbooru
Sampling in float32 precision from a float16 UNet - marunine for the idea, Birch-san for the example Diffusers implementation (https://github.com/Birch-san/diffusers-play/tree/92feee6)
Instruct pix2pix - Tim Brooks (star), Aleksander Holynski (star), Alexei A. Efros (no star) - https://github.com/timothybrooks/instruct-pix2pix
UniPC sampler - Wenliang Zhao - https://github.com/wl-zhao/UniPC