InstantX
/

SD3-Controlnet-Canny

Model card Files Files and versions Community

SD3-Controlnet-Canny / README.md

wanghaofan's picture

Update README.md

c164579 verified 4 months ago

|

No virus

2.11 kB

	# SD3 Controlnet




	\| control image \| weight=0.0 \| weight=0.3 \| weight=0.5 \| weight=0.7 \| weight=0.9 \|
	\|:-------------------------:\|:-------------------------:\|:-------------------------:\|:-------------------------:\|:-------------------------:\|:-------------------------:\|
	\|<img src="./canny.jpg" width = "400" /> \| <img src="./demo_0.jpg" width = "400" /> \| <img src="./demo_3.jpg" width = "400" /> \| <img src="./demo_5.jpg" width = "400" /> \| <img src="./demo_7.jpg" width = "400" /> \| <img src="./demo_9.jpg" width = "400" /> \|


	# Install Diffusers-SD3-Controlnet

	The current [diffusers](https://github.com/instantX-research/diffusers_sd3_control.git) have not been merged into the official code yet.

	```cmd
	git clone -b sd3_control https://github.com/instantX-research/diffusers_sd3_control.git
	cd diffusers_sd3_control
	pip install -e .
	```

	# Demo
	```python
	import torch
	from diffusers import StableDiffusion3ControlNetPipeline
	from diffusers.models import SD3ControlNetModel, SD3MultiControlNetModel
	from diffusers.utils import load_image

	# load pipeline
	controlnet = SD3ControlNetModel.from_pretrained("InstantX/SD3-Controlnet-Canny")
	pipe = StableDiffusion3ControlNetPipeline.from_pretrained(
	"stabilityai/stable-diffusion-3-medium-diffusers",
	controlnet=controlnet
	)
	pipe.to("cuda", torch.float16)

	# config
	control_image = load_image("https://huggingface.co/InstantX/SD3-Controlnet-Canny/resolve/main/canny.jpg")
	prompt = 'Anime style illustration of a girl wearing a suit. A moon in sky. In the background we see a big rain approaching. text "InstantX" on image'
	n_prompt = 'NSFW, nude, naked, porn, ugly'
	image = pipe(
	prompt,
	negative_prompt=n_prompt,
	control_image=control_image,
	controlnet_conditioning_scale=0.5,
	).images[0]
	image.save('image.jpg')
	```


	## Limitation
	Due to the fact that only 1024*1024 pixel resolution was used during the training phase,
	the inference performs best at this size, with other sizes yielding suboptimal results.
	We will initiate multi-resolution training in the future, and at that time, we will open-source the new weights.