---
base_model: runwayml/stable-diffusion-v1-5
library_name: diffusers
license: creativeml-openrail-m
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- controlnet
- control-lora-v3
- diffusers-training
inference: true
---

# ControlLoRA Version 3 Pretrained Models Collection

This is a collection of control-lora-v3 weights trained on runwayml/stable-diffusion-v1-5 and stabilityai/stable-diffusion-xl-base-1.0 with different types of conditioning.
You can find some example images below.

## Stable Diffusion

### Canny
<div style="display: flex; flex-wrap: wrap;">
<img src="./imgs/canny1.png" style="height:256px;" />
<img src="./imgs/canny2.png" style="height:256px;" />
<img src="./imgs/canny3.png" style="height:256px;" />
<img src="./imgs/canny4.png" style="height:256px;" />
<img src="./imgs/canny_vermeer.png" style="height:256px;" />
</div>

### OpenPose + Segmentation

This combination is experimental and does not work well yet.

<div style="display: flex; flex-wrap: wrap;">
<img src="./imgs/pose5_segmentation5.png" style="height:256px;" />
<img src="./imgs/pose6_segmentation6.png" style="height:256px;" />
<img src="./imgs/pose7_segmentation7.png" style="height:256px;" />
<img src="./imgs/pose8_segmentation8.png" style="height:256px;" />
</div>

### Depth
<div style="display: flex; flex-wrap: wrap;">
<img src="./imgs/depth1.png" style="height:256px;" />
<img src="./imgs/depth2.png" style="height:256px;" />
<img src="./imgs/depth3.png" style="height:256px;" />
<img src="./imgs/depth4.png" style="height:256px;" />
</div>

### Normal map
<div style="display: flex; flex-wrap: wrap;">
<img src="./imgs/normal1.png" style="height:256px;" />
<img src="./imgs/normal2.png" style="height:256px;" />
<img src="./imgs/normal3.png" style="height:256px;" />
<img src="./imgs/normal4.png" style="height:256px;" />
</div>

### OpenPose
<div style="display: flex; flex-wrap: wrap;">
<img src="./imgs/pose1.png" style="height:256px;" />
<img src="./imgs/pose2.png" style="height:256px;" />
<img src="./imgs/pose3.png" style="height:256px;" />
<img src="./imgs/pose4.png" style="height:256px;" />
<img src="./imgs/pose5.png" style="height:256px;" />
<img src="./imgs/pose6.png" style="height:256px;" />
<img src="./imgs/pose7.png" style="height:256px;" />
<img src="./imgs/pose8.png" style="height:256px;" />
</div>

### Segmentation

<div style="display: flex; flex-wrap: wrap;">
<img src="./imgs/segmentation1.png" style="height:256px;" />
<img src="./imgs/segmentation2.png" style="height:256px;" />
<img src="./imgs/segmentation3.png" style="height:256px;" />
<img src="./imgs/segmentation4.png" style="height:256px;" />
<img src="./imgs/segmentation5.png" style="height:256px;" />
<img src="./imgs/segmentation6.png" style="height:256px;" />
<img src="./imgs/segmentation7.png" style="height:256px;" />
<img src="./imgs/segmentation8.png" style="height:256px;" />
</div>

### Tile
<div style="display: flex; flex-wrap: wrap;">
<img src="./imgs/tile1.png" style="height:256px;" />
<img src="./imgs/tile2.png" style="height:256px;" />
<img src="./imgs/tile3.png" style="height:256px;" />
<img src="./imgs/tile4.png" style="height:256px;" />
</div>

## Stable Diffusion XL

### Canny
<div style="display: flex;">
<img src="./imgs/sdxl_canny1.png" style="height:256px;" />
<img src="./imgs/sdxl_canny2.png" style="height:256px;" />
<img src="./imgs/sdxl_canny3.png" style="height:256px;" />
<img src="./imgs/sdxl_canny4.png" style="height:256px;" />
<img src="./imgs/sdxl_canny_vermeer.png" style="height:256px;" />
</div>

## Intended uses & limitations

#### How to use

First, clone the [control-lora-v3](https://github.com/HighCWu/control-lora-v3) repository and `cd` into the directory:
```sh
git clone https://github.com/HighCWu/control-lora-v3
cd control-lora-v3
```

Then run the Python code.

For Stable Diffusion, use:

```py
# !pip install opencv-python transformers accelerate
from diffusers import UniPCMultistepScheduler
from diffusers.utils import load_image
from model import UNet2DConditionModelEx
from pipeline import StableDiffusionControlLoraV3Pipeline
import numpy as np
import torch

import cv2
from PIL import Image

# download an image
image = load_image(
    "https://hf.co/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png"
)
image = np.array(image)

# get canny image: detect edges, then replicate the single channel to three channels
image = cv2.Canny(image, 100, 200)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image)

# load stable diffusion v1-5 and control-lora-v3
unet: UNet2DConditionModelEx = UNet2DConditionModelEx.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet", torch_dtype=torch.float16
)
unet = unet.add_extra_conditions(["canny"])
pipe = StableDiffusionControlLoraV3Pipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", unet=unet, torch_dtype=torch.float16
)
# load the control-lora-v3 LoRA weights
# pipe.load_lora_weights("HighCWu/sd-control-lora-v3-canny")
pipe.load_lora_weights("HighCWu/control-lora-v3", subfolder="sd-control-lora-v3-canny-half_skip_attn-rank16-conv_in-rank64")

# speed up diffusion process with faster scheduler and memory optimization
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
# remove the following line if xformers is not installed
pipe.enable_xformers_memory_efficient_attention()

pipe.enable_model_cpu_offload()

# generate image
generator = torch.manual_seed(0)
image = pipe(
    "futuristic-looking woman", num_inference_steps=20, generator=generator, image=canny_image
).images[0]
image.show()
```

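The Canny step in the example above turns a single-channel edge map into a three-channel image before passing it to the pipeline. The following is a minimal sketch of just that reshaping step, using a small synthetic array as a stand-in for real `cv2.Canny` output. The helper name `to_three_channels` is hypothetical and is not part of control-lora-v3.

```python
import numpy as np


def to_three_channels(edges: np.ndarray) -> np.ndarray:
    """Replicate an (H, W) edge map into an (H, W, 3) array."""
    if edges.ndim != 2:
        raise ValueError(f"expected a 2-D edge map, got shape {edges.shape}")
    # Add a trailing channel axis, then repeat it three times along that axis.
    return np.repeat(edges[:, :, None], 3, axis=2)


# Synthetic stand-in for cv2.Canny output: uint8 values of 0 or 255.
edges = np.zeros((4, 6), dtype=np.uint8)
edges[1, 2] = 255

stacked = to_three_channels(edges)
print(stacked.shape, stacked.dtype)
```

`np.repeat` along the new axis yields the same array as the `np.concatenate([image, image, image], axis=2)` call in the example; either form should work before `Image.fromarray`.
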
For Stable Diffusion XL, use:

```py
# !pip install opencv-python transformers accelerate
from diffusers import AutoencoderKL
from diffusers.utils import load_image
from model import UNet2DConditionModelEx
from pipeline_sdxl import StableDiffusionXLControlLoraV3Pipeline
import numpy as np
import torch

import cv2
from PIL import Image

prompt = "aerial view, a futuristic research complex in a bright foggy jungle, hard lighting"
# note: negative_prompt is defined here but is not passed to the pipeline call below
negative_prompt = "low quality, bad quality, sketches"

# download an image
image = load_image(
    "https://hf.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png"
)

# initialize the models and pipeline
unet: UNet2DConditionModelEx = UNet2DConditionModelEx.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", subfolder="unet", torch_dtype=torch.float16
)
unet = unet.add_extra_conditions(["canny"])
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlLoraV3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", unet=unet, vae=vae, torch_dtype=torch.float16
)
# load the control-lora-v3 LoRA weights
# pipe.load_lora_weights("HighCWu/sdxl-control-lora-v3-canny")
pipe.load_lora_weights("HighCWu/control-lora-v3", subfolder="sdxl-control-lora-v3-canny-half_skip_attn-rank16-conv_in-rank64")
pipe.enable_model_cpu_offload()

# get canny image: detect edges, then replicate the single channel to three channels
image = np.array(image)
image = cv2.Canny(image, 100, 200)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image)

# generate image
image = pipe(
    prompt, image=canny_image
).images[0]
image.show()
```

#### Limitations and bias

[TODO: provide examples of latent issues and potential remediations]

## Training details

[TODO: describe the data used to train the model]