Why does the SD1.5 model, combined with lllyasviel/control_v11p_sd15_canny's ControlNet, generate poor image quality?

#1
by michaelj - opened

Why does the SD1.5 model, combined with lllyasviel/control_v11p_sd15_canny's ControlNet, generate poor image quality?

ByteDance org

Thanks for your attention! We have updated the ControlNet Usage section; you can refer to the examples we provided.

The quality of the pictures I generate using the LoRA in WebUI and ComfyUI is relatively poor. How should I adjust the parameters? Is there a workflow suitable for ComfyUI?

Thank you for your contribution, but when I used your example, the quality of the generated images was still not very good. I only changed the image address; the other parameters were the same. For example, when I used the image of this blank room, the quality of the generated images was much worse than what I get when generating directly with Canny. I adjusted the step count and other parameters, but it still did not help. What is the problem? https://img.mp.itc.cn/upload/20161130/937a6e38645046e6956f8e317ca04ab0_th.jpg

@michaelj
Hi, can you provide the image you generated so we can check it for you?

ByteDance org

Thanks for your attention! We have uploaded ComfyUI workflows for the Hyper-SD LoRAs, and we are still working on the workflow for the 1-step UNet, which will be uploaded as soon as possible.

This is my code; I only changed the image URL:
import torch
from diffusers.utils import load_image
import numpy as np
import cv2
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, TCDScheduler
from huggingface_hub import hf_hub_download

controlnet_checkpoint = "lllyasviel/control_v11p_sd15_canny"

def load_and_resize_image(url, target_size):
    # Load the image
    image = load_image(url)

    # Convert the PIL Image to a numpy array for further processing
    image_array = np.array(image)

    # Get the original aspect ratio (numpy arrays are (height, width, channels))
    original_height, original_width = image_array.shape[:2]
    aspect_ratio = original_width / original_height

    # Compute the new height that preserves the aspect ratio, assuming a target width of 512 pixels
    new_width = target_size
    new_height = int(new_width / aspect_ratio)

    # If the new height exceeds the target size, recompute the width to keep the target height instead
    if new_height > target_size:
        new_height = target_size
        new_width = int(new_height * aspect_ratio)

    # Resize the image proportionally
    resized_image = Image.fromarray(image_array).resize((new_width, new_height), resample=Image.LANCZOS)
    resized_image.save("resize.png")
    return resized_image
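# Note (an assumption worth checking, not part of the original script): SD1.5 pipelines work on
# latents downsampled by a factor of 8, so widths and heights that are not multiples of 8 may be
# adjusted internally. A hypothetical helper like this could round the resized dimensions down
# before they reach the pipeline:
def round_down_to_multiple_of_8(size):
    # Round a pixel dimension down to the nearest multiple of 8 (minimum 8)
    return max(8, (size // 8) * 8)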

# Load original image

image = load_and_resize_image("https://hf-mirror.com/lllyasviel/control_v11p_sd15_canny/resolve/main/images/input.png", 512)

# Note: the line below overwrites the resized image above with a different input image
image = load_image("https://tse3-mm.cn.bing.net/th/id/OIP-C.GdC1ep4Jc3R5nZSUfJlacgHaFj?rs=1&pid=ImgDetMain")

image = np.array(image)

# Prepare Canny Control Image

low_threshold = 100
high_threshold = 200
image = cv2.Canny(image, low_threshold, high_threshold)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
control_image = Image.fromarray(image)
control_image.save("control.png")

# Initialize pipeline

controlnet = ControlNetModel.from_pretrained(controlnet_checkpoint, torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained("Lykon/dreamshaper-8", controlnet=controlnet, torch_dtype=torch.float16).to("cuda")

# Load Hyper-SD15-1step lora
pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SD15-1step-lora.safetensors"))
pipe.fuse_lora()

# Use TCD scheduler to achieve better image quality
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)

# Lower eta results in more detail for multi-steps inference
eta = 1.0
image = pipe("a beautiful room", num_inference_steps=4, image=control_image, guidance_scale=1, eta=eta).images[0]
image.save('image_out2.png')
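For tuning, a small sweep over step counts and eta values can show the trade-off the comment above refers to (lower eta tends to give more detail in multi-step inference). This is only a rough sketch reusing the pipe and control_image defined above; the specific step counts, eta values, and output filenames are arbitrary examples, not recommended settings.

# Rough sweep over assumed step counts and eta values to compare detail vs. speed
for steps in (1, 2, 4, 8):
    for eta_value in (1.0, 0.8, 0.5):
        out = pipe("a beautiful room", num_inference_steps=steps, image=control_image, guidance_scale=1, eta=eta_value).images[0]
        out.save(f"room_steps{steps}_eta{eta_value}.png")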

image_out2.png

@michaelj Hi, we just checked the control image that cv2.Canny produced from your input and found that the edges / structure of the room are not recognized well.
So your pipeline is okay and, unfortunately, this result was expected. You can try other input images with clearer edges, or try other SD15 demos (e.g. https://huggingface.co/spaces/radames/InstantStyle-Hyper-SD) or SDXL demos (e.g. https://huggingface.co/spaces/radames/InstantStyle-Hyper-SDXL, https://huggingface.co/spaces/ByteDance/Hyper-SDXL-1Step-T2I). Thank you for your attention!
control.png
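One quick way to confirm this is to preview the Canny map on its own with a few different thresholds before running the full pipeline. A minimal sketch along those lines, reusing the input URL from the script above; the threshold pairs are arbitrary examples rather than recommended values:

import cv2
import numpy as np
from PIL import Image
from diffusers.utils import load_image

# Load the input image and save the edge map for several (low, high) threshold pairs
input_image = np.array(load_image("https://tse3-mm.cn.bing.net/th/id/OIP-C.GdC1ep4Jc3R5nZSUfJlacgHaFj?rs=1&pid=ImgDetMain"))
for low, high in [(50, 150), (100, 200), (150, 250)]:
    edges = cv2.Canny(input_image, low, high)
    Image.fromarray(edges).save(f"canny_{low}_{high}.png")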

renyuxi changed discussion status to closed
