---
base_model: runwayml/stable-diffusion-v1-5
library_name: diffusers
license: creativeml-openrail-m
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- controlnet
- control-lora-v3
- diffusers-training
inference: true
---

# ControlLoRA Version 3 Pretrained Models Collection

This is a collection of control-lora-v3 weights trained on runwayml/stable-diffusion-v1-5 and stabilityai/stable-diffusion-xl-base-1.0 with different types of conditioning. You can find some example images below.

## Stable Diffusion

### Canny
### OpenPose + Segmentation

This is experimental, and it doesn't work well.
### Depth
### Normal map
### OpenPose
### Segmentation
### Tile
## Stable Diffusion XL

### Canny
## Intended uses & limitations

#### How to use

First clone the [control-lora-v3](https://github.com/HighCWu/control-lora-v3) repository and `cd` into the directory:

```sh
git clone https://github.com/HighCWu/control-lora-v3
cd control-lora-v3
```

Then run the Python code. For Stable Diffusion, use:

```py
# !pip install opencv-python transformers accelerate
from diffusers import UniPCMultistepScheduler
from diffusers.utils import load_image

# local modules from the cloned control-lora-v3 repository
from model import UNet2DConditionModelEx
from pipeline import StableDiffusionControlLoraV3Pipeline

import numpy as np
import torch

import cv2
from PIL import Image

# download an image
image = load_image(
    "https://hf.co/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png"
)
image = np.array(image)

# get canny image
image = cv2.Canny(image, 100, 200)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image)

# load stable diffusion v1-5 and control-lora-v3
unet: UNet2DConditionModelEx = UNet2DConditionModelEx.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet", torch_dtype=torch.float16
)
unet = unet.add_extra_conditions(["canny"])
pipe = StableDiffusionControlLoraV3Pipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", unet=unet, torch_dtype=torch.float16
)

# load attention processors
# pipe.load_lora_weights("HighCWu/sd-control-lora-v3-canny")
pipe.load_lora_weights(
    "HighCWu/control-lora-v3",
    subfolder="sd-control-lora-v3-canny-half_skip_attn-rank16-conv_in-rank64"
)

# speed up diffusion process with faster scheduler and memory optimization
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
# remove following line if xformers is not installed
pipe.enable_xformers_memory_efficient_attention()

pipe.enable_model_cpu_offload()

# generate image
generator = torch.manual_seed(0)
image = pipe(
    "futuristic-looking woman", num_inference_steps=20, generator=generator, image=canny_image
).images[0]
image.show()
```
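The same pipeline can load the other Stable Diffusion conditionings listed above. Here is a minimal sketch for depth, assuming the other weight subfolders follow the same naming pattern as the canny one; the condition name and subfolder below are extrapolations, so verify the exact names in the HighCWu/control-lora-v3 repository file list:

```py
import torch
from diffusers import UniPCMultistepScheduler

# local modules from the cloned control-lora-v3 repository
from model import UNet2DConditionModelEx
from pipeline import StableDiffusionControlLoraV3Pipeline

unet: UNet2DConditionModelEx = UNet2DConditionModelEx.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet", torch_dtype=torch.float16
)
unet = unet.add_extra_conditions(["depth"])  # assumed condition name
pipe = StableDiffusionControlLoraV3Pipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", unet=unet, torch_dtype=torch.float16
)
pipe.load_lora_weights(
    "HighCWu/control-lora-v3",
    # assumed subfolder name, extrapolated from the canny example
    subfolder="sd-control-lora-v3-depth-half_skip_attn-rank16-conv_in-rank64",
)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()

# then pass a depth map as `image=` exactly as the canny image is passed above
```

Judging by the subfolder name alone, these weights appear to use a rank-16 LoRA that skips half of the attention layers, with a rank-64 conv_in; that reading is inferred from the naming, not confirmed by the card.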
For Stable Diffusion XL, use:

```py
# !pip install opencv-python transformers accelerate
from diffusers import AutoencoderKL
from diffusers.utils import load_image

# local modules from the cloned control-lora-v3 repository
from model import UNet2DConditionModelEx
from pipeline_sdxl import StableDiffusionXLControlLoraV3Pipeline

import numpy as np
import torch

import cv2
from PIL import Image

prompt = "aerial view, a futuristic research complex in a bright foggy jungle, hard lighting"
negative_prompt = "low quality, bad quality, sketches"

# download an image
image = load_image(
    "https://hf.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png"
)

# initialize the models and pipeline
unet: UNet2DConditionModelEx = UNet2DConditionModelEx.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", subfolder="unet", torch_dtype=torch.float16
)
unet = unet.add_extra_conditions(["canny"])
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlLoraV3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", unet=unet, vae=vae, torch_dtype=torch.float16
)

# load attention processors
# pipe.load_lora_weights("HighCWu/sdxl-control-lora-v3-canny")
pipe.load_lora_weights(
    "HighCWu/control-lora-v3",
    subfolder="sdxl-control-lora-v3-canny-half_skip_attn-rank16-conv_in-rank64"
)
pipe.enable_model_cpu_offload()

# get canny image
image = np.array(image)
image = cv2.Canny(image, 100, 200)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image)

# generate image
image = pipe(
    prompt, negative_prompt=negative_prompt, image=canny_image
).images[0]
image.show()
```

#### Limitations and bias

[TODO: provide examples of latent issues and potential remediations]

## Training details

[TODO: describe the data used to train the model]