---
base_model: runwayml/stable-diffusion-v1-5
library_name: diffusers
license: creativeml-openrail-m
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- controlnet
- control-lora-v3
- diffusers-training
inference: true
---
# ControlLoRA Version 3 Pretrained Models Collection
This is a collection of control-lora-v3 weights trained on runwayml/stable-diffusion-v1-5 and stabilityai/stable-diffusion-xl-base-1.0 with different types of conditioning.
You can find some example images below.
## Stable Diffusion
### Canny
<div style="display: flex; flex-wrap: wrap;">
<img src="./imgs/canny1.png" style="height:256px;" />
<img src="./imgs/canny2.png" style="height:256px;" />
<img src="./imgs/canny3.png" style="height:256px;" />
<img src="./imgs/canny4.png" style="height:256px;" />
<img src="./imgs/canny_vermeer.png" style="height:256px;" />
</div>
### OpenPose + Segmentation
This combination is experimental and does not yet work well.
<div style="display: flex; flex-wrap: wrap;">
<img src="./imgs/pose5_segmentation5.png" style="height:256px;" />
<img src="./imgs/pose6_segmentation6.png" style="height:256px;" />
<img src="./imgs/pose7_segmentation7.png" style="height:256px;" />
<img src="./imgs/pose8_segmentation8.png" style="height:256px;" />
</div>
### Depth
<div style="display: flex; flex-wrap: wrap;">
<img src="./imgs/depth1.png" style="height:256px;" />
<img src="./imgs/depth2.png" style="height:256px;" />
<img src="./imgs/depth3.png" style="height:256px;" />
<img src="./imgs/depth4.png" style="height:256px;" />
</div>
### Normal map
<div style="display: flex; flex-wrap: wrap;">
<img src="./imgs/normal1.png" style="height:256px;" />
<img src="./imgs/normal2.png" style="height:256px;" />
<img src="./imgs/normal3.png" style="height:256px;" />
<img src="./imgs/normal4.png" style="height:256px;" />
</div>
### OpenPose
<div style="display: flex; flex-wrap: wrap;">
<img src="./imgs/pose1.png" style="height:256px;" />
<img src="./imgs/pose2.png" style="height:256px;" />
<img src="./imgs/pose3.png" style="height:256px;" />
<img src="./imgs/pose4.png" style="height:256px;" />
<img src="./imgs/pose5.png" style="height:256px;" />
<img src="./imgs/pose6.png" style="height:256px;" />
<img src="./imgs/pose7.png" style="height:256px;" />
<img src="./imgs/pose8.png" style="height:256px;" />
</div>
### Segmentation
<div style="display: flex; flex-wrap: wrap;">
<img src="./imgs/segmentation1.png" style="height:256px;" />
<img src="./imgs/segmentation2.png" style="height:256px;" />
<img src="./imgs/segmentation3.png" style="height:256px;" />
<img src="./imgs/segmentation4.png" style="height:256px;" />
<img src="./imgs/segmentation5.png" style="height:256px;" />
<img src="./imgs/segmentation6.png" style="height:256px;" />
<img src="./imgs/segmentation7.png" style="height:256px;" />
<img src="./imgs/segmentation8.png" style="height:256px;" />
</div>
### Tile
<div style="display: flex; flex-wrap: wrap;">
<img src="./imgs/tile1.png" style="height:256px;" />
<img src="./imgs/tile2.png" style="height:256px;" />
<img src="./imgs/tile3.png" style="height:256px;" />
<img src="./imgs/tile4.png" style="height:256px;" />
</div>
## Stable Diffusion XL
### Canny
<div style="display: flex;">
<img src="./imgs/sdxl_canny1.png" style="height:256px;" />
<img src="./imgs/sdxl_canny2.png" style="height:256px;" />
<img src="./imgs/sdxl_canny3.png" style="height:256px;" />
<img src="./imgs/sdxl_canny4.png" style="height:256px;" />
<img src="./imgs/sdxl_canny_vermeer.png" style="height:256px;" />
</div>
## Intended uses & limitations
#### How to use
First clone the [control-lora-v3](https://github.com/HighCWu/control-lora-v3) repository and `cd` into it:
```sh
git clone https://github.com/HighCWu/control-lora-v3
cd control-lora-v3
```
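If the examples below complain about missing packages, install the usual dependencies first. This is a minimal setup sketch; the repository may pin its own requirements, so prefer those if present:
```sh
# minimal dependencies for the examples below; adjust to the repository's own requirements if it pins versions
pip install torch diffusers transformers accelerate opencv-python pillow
```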
Then run the Python code below.
For Stable Diffusion, use:
```py
# !pip install opencv-python transformers accelerate
from diffusers import UniPCMultistepScheduler
from diffusers.utils import load_image
from model import UNet2DConditionModelEx
from pipeline import StableDiffusionControlLoraV3Pipeline
import numpy as np
import torch
import cv2
from PIL import Image
# download an image
image = load_image(
"https://hf.co/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png"
)
image = np.array(image)
# get canny image
image = cv2.Canny(image, 100, 200)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image)
# load stable diffusion v1-5 and control-lora-v3
unet: UNet2DConditionModelEx = UNet2DConditionModelEx.from_pretrained(
"runwayml/stable-diffusion-v1-5", subfolder="unet", torch_dtype=torch.float16
)
unet = unet.add_extra_conditions(["canny"])
pipe = StableDiffusionControlLoraV3Pipeline.from_pretrained(
"runwayml/stable-diffusion-v1-5", unet=unet, torch_dtype=torch.float16
)
# load the control-lora-v3 weights
# pipe.load_lora_weights("HighCWu/sd-control-lora-v3-canny")
pipe.load_lora_weights("HighCWu/control-lora-v3", subfolder="sd-control-lora-v3-canny-half_skip_attn-rank16-conv_in-rank64")
# speed up diffusion process with faster scheduler and memory optimization
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
# remove the following line if xformers is not installed
pipe.enable_xformers_memory_efficient_attention()
pipe.enable_model_cpu_offload()
# generate image
generator = torch.manual_seed(0)
image = pipe(
"futuristic-looking woman", num_inference_steps=20, generator=generator, image=canny_image
).images[0]
image.show()
```
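The other Stable Diffusion conditions in this collection follow the same pattern; only the extra-condition name, the LoRA subfolder, and the image preprocessing change. As a rough sketch for depth conditioning (the subfolder name below is illustrative, so check this repository's file list for the exact one):
```py
# !pip install opencv-python transformers accelerate
from transformers import pipeline as depth_pipeline
from diffusers import UniPCMultistepScheduler
from diffusers.utils import load_image
from model import UNet2DConditionModelEx
from pipeline import StableDiffusionControlLoraV3Pipeline
import numpy as np
import torch
from PIL import Image

# estimate a depth map with a transformers depth-estimation pipeline
depth_estimator = depth_pipeline("depth-estimation")
image = load_image(
    "https://hf.co/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png"
)
depth = np.array(depth_estimator(image)["depth"])[:, :, None]
depth_image = Image.fromarray(np.concatenate([depth, depth, depth], axis=2))

# same loading pattern as the canny example, but with "depth" as the extra condition
unet: UNet2DConditionModelEx = UNet2DConditionModelEx.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet", torch_dtype=torch.float16
)
unet = unet.add_extra_conditions(["depth"])
pipe = StableDiffusionControlLoraV3Pipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", unet=unet, torch_dtype=torch.float16
)
# NOTE: illustrative subfolder name; pick the actual depth folder from this repository
pipe.load_lora_weights("HighCWu/control-lora-v3", subfolder="sd-control-lora-v3-depth-half_skip_attn-rank16-conv_in-rank64")
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()

image = pipe("a portrait of a woman", num_inference_steps=20, image=depth_image).images[0]
image.show()
```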
For Stable Diffusion XL, use:
```py
# !pip install opencv-python transformers accelerate
from diffusers import AutoencoderKL
from diffusers.utils import load_image
from model import UNet2DConditionModelEx
from pipeline_sdxl import StableDiffusionXLControlLoraV3Pipeline
import numpy as np
import torch
import cv2
from PIL import Image
prompt = "aerial view, a futuristic research complex in a bright foggy jungle, hard lighting"
negative_prompt = "low quality, bad quality, sketches"
# download an image
image = load_image(
"https://hf.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png"
)
# initialize the models and pipeline
unet: UNet2DConditionModelEx = UNet2DConditionModelEx.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", subfolder="unet", torch_dtype=torch.float16
)
unet = unet.add_extra_conditions(["canny"])
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlLoraV3Pipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", unet=unet, vae=vae, torch_dtype=torch.float16
)
# load the control-lora-v3 weights
# pipe.load_lora_weights("HighCWu/sdxl-control-lora-v3-canny")
pipe.load_lora_weights("HighCWu/control-lora-v3", subfolder="sdxl-control-lora-v3-canny-half_skip_attn-rank16-conv_in-rank64")
pipe.enable_model_cpu_offload()
# get canny image
image = np.array(image)
image = cv2.Canny(image, 100, 200)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image)
# generate image
image = pipe(
    prompt, negative_prompt=negative_prompt, image=canny_image
).images[0]
image.show()
```
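In both examples, `image.show()` opens the result in the default image viewer; on a headless machine, save it instead with `image.save("output.png")`.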
#### Limitations and bias
[TODO: provide examples of latent issues and potential remediations]
## Training details
[TODO: describe the data used to train the model]