Model does not work in ComfyUI

#6
by feiyuuu - opened

image.png
This is the result in ComfyUI: the top image is without this ControlNet and the bottom image is with it. There is not much difference between them. The base model is animagineXL_v3.1; no LoRA was used.

I am not sure whether you followed the inference code correctly. The ControlNet strength should be set to 1.0. You can run the code in Python first to check whether it works.
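A minimal sketch of such a standalone check with the diffusers library, assuming a CUDA GPU and float16 weights. The function name `generate` and its parameters are placeholders for illustration; the model identifiers are left as arguments because this thread does not spell them out.

```python
CONTROLNET_STRENGTH = 1.0  # strength recommended in this thread


def generate(prompt, pose_image, controlnet_id, base_model_id, seed=0):
    """Run one SDXL + ControlNet generation and return a PIL image.

    pose_image is the rendered skeleton (a PIL image), not the source photo.
    controlnet_id and base_model_id are Hub ids or local paths (placeholders).
    """
    # Heavy imports are deferred so this file can be imported without a GPU.
    import torch
    from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline

    controlnet = ControlNetModel.from_pretrained(
        controlnet_id, torch_dtype=torch.float16
    )
    pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
        base_model_id, controlnet=controlnet, torch_dtype=torch.float16
    ).to("cuda")

    # A fixed seed makes runs with and without ControlNet comparable.
    generator = torch.Generator("cuda").manual_seed(seed)
    result = pipe(
        prompt,
        image=pose_image,
        controlnet_conditioning_scale=CONTROLNET_STRENGTH,
        generator=generator,
    )
    return result.images[0]
```

Running this with the same prompt, seed and pose image as in ComfyUI helps separate a model problem from a workflow problem.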

image.png
The model has been checked by many people on Twitter, so I think you could ask someone there for help.

Hi again, I'm coming back to this issue. I did another test. This is the original image without ControlNet, and the pose image.

屏幕截图 2024-06-25 095859.png
With this model: https://huggingface.co/thibaud/controlnet-openpose-sdxl-1.0, both poses work, but with your model the left pose works and the right one does not.

屏幕截图 2024-06-25 095954.png
The left two images correspond to the above model and the right two images correspond to yours. You can see that the last image was not controlled.
Besides, in the third image the style is not consistent, as you can see from the person's armor.
I did some more tests. With your model, only some of the poses work. I'm sure all the parameters are the same; I only changed the model and the pose image. Any idea about this? @xinsir

Can you send me the control image and the prompt? I think there may be some misalignment in the preprocessing in ComfyUI. I have not tested it in ComfyUI; I tested it offline with multiple models, including xl-base, counterfeit, blue pencil and so on. We tested 2000+ images with ground-truth annotations and calculated the mAP as in COCO. It would be unusual for the performance to be below t2i or thibaud, because in the offline test the model achieves 10 mAP higher than those two models. The most precise model in the open-source community before this was the SD1.5 one from the original author, Lvmin Zhang. I will learn ComfyUI and check it. I may need some time, because I will soon release two other ControlNet models. If you have ComfyUI scripts, you can send them to me as well.
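For readers unfamiliar with COCO-style keypoint evaluation: the mAP is built on object keypoint similarity (OKS) between predicted and ground-truth keypoints. The sketch below shows only that similarity for a single person, following the COCO definition; it is not the evaluation code used for this model, and the sigma values in the usage lines are illustrative.

```python
import math
from typing import Sequence, Tuple


def object_keypoint_similarity(
    pred: Sequence[Tuple[float, float]],
    gt: Sequence[Tuple[float, float, int]],
    sigmas: Sequence[float],
    area: float,
) -> float:
    """OKS for one person.

    pred   : predicted (x, y) per keypoint, in pixels
    gt     : ground-truth (x, y, visibility); visibility 0 means unlabeled
    sigmas : per-keypoint falloff constants
    area   : ground-truth object area in pixels (the squared scale)
    """
    total, labeled = 0.0, 0
    for (px, py), (gx, gy, vis), sigma in zip(pred, gt, sigmas):
        if vis <= 0:
            continue  # unlabeled keypoints do not count
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        total += math.exp(-d2 / (2.0 * area * (2.0 * sigma) ** 2))
        labeled += 1
    return total / labeled if labeled else 0.0


# Illustrative values: two labeled keypoints and one unlabeled keypoint.
gt = [(100.0, 100.0, 2), (150.0, 120.0, 2), (0.0, 0.0, 0)]
pred = [(102.0, 101.0), (149.0, 125.0), (30.0, 40.0)]
print(object_keypoint_similarity(pred, gt, [0.026, 0.079, 0.079], area=10000.0))
```

A prediction counts as a match when its OKS exceeds a threshold, and the mAP averages precision over a range of such thresholds.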

Or you can use the instruction code to run it with the same control image and prompt. I am currently busy releasing other ControlNet models, so I may need some time to help you check the ComfyUI code.

Hello, I think I found the reason. You can try the following:

Find the path to the virtual environment where you installed controlnet_aux; it is something like /home/zhangsan/anaconda3/envs/diffusers/lib/python3.8/site-packages/controlnet_aux/open_pose/util.py
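If the location is not obvious, a small standard-library helper can print it. This is a sketch; `locate_package_file` is a hypothetical helper name, and it only reports where the file is without modifying anything.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def locate_package_file(package: str, relative_path: str) -> Optional[Path]:
    """Return the path of relative_path inside an installed package, or None."""
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    if spec is None or spec.origin is None:
        return None  # not installed, or a namespace package without a file
    candidate = Path(spec.origin).parent / relative_path
    return candidate if candidate.exists() else None


if __name__ == "__main__":
    # "open_pose/util.py" is the file named in this thread.
    print(locate_package_file("controlnet_aux", "open_pose/util.py"))
```

Run it with the same Python interpreter that the application uses; a bundled interpreter can have its own copy of the package.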

Replace the draw_bodypose function with the following code:

def draw_bodypose(canvas: np.ndarray, keypoints: List[Keypoint]) -> np.ndarray:
    """
    Draw keypoints and limbs representing body pose on a given canvas.

    Args:
        canvas (np.ndarray): A 3D numpy array representing the canvas (image) on which to draw the body pose.
        keypoints (List[Keypoint]): A list of Keypoint objects representing the body keypoints to be drawn.

    Returns:
        np.ndarray: A 3D numpy array representing the modified canvas with the drawn body pose.

    Note:
        The function expects the x and y coordinates of the keypoints to be normalized between 0 and 1.
    """
    H, W, C = canvas.shape

    # Scale the stick width and keypoint radius with the longer image side.
    long_side = max(W, H)
    if long_side < 500:
        ratio = 1.0
    elif long_side < 1000:
        ratio = 2.0
    elif long_side < 2000:
        ratio = 3.0
    elif long_side < 3000:
        ratio = 4.0
    elif long_side < 4000:
        ratio = 5.0
    elif long_side < 5000:
        ratio = 6.0
    else:
        ratio = 7.0

    stickwidth = 4

    limbSeq = [
        [2, 3], [2, 6], [3, 4], [4, 5], 
        [6, 7], [7, 8], [2, 9], [9, 10], 
        [10, 11], [2, 12], [12, 13], [13, 14], 
        [2, 1], [1, 15], [15, 17], [1, 16], 
        [16, 18],
    ]

    colors = [[255, 0, 0], [255, 85, 0], [255, 170, 0], [255, 255, 0], [170, 255, 0], [85, 255, 0], [0, 255, 0], \
              [0, 255, 85], [0, 255, 170], [0, 255, 255], [0, 170, 255], [0, 85, 255], [0, 0, 255], [85, 0, 255], \
              [170, 0, 255], [255, 0, 255], [255, 0, 170], [255, 0, 85]]

    for (k1_index, k2_index), color in zip(limbSeq, colors):
        keypoint1 = keypoints[k1_index - 1]
        keypoint2 = keypoints[k2_index - 1]

        if keypoint1 is None or keypoint2 is None:
            continue

        Y = np.array([keypoint1.x, keypoint2.x]) * float(W)
        X = np.array([keypoint1.y, keypoint2.y]) * float(H)
        mX = np.mean(X)
        mY = np.mean(Y)
        length = ((X[0] - X[1]) ** 2 + (Y[0] - Y[1]) ** 2) ** 0.5
        angle = math.degrees(math.atan2(X[0] - X[1], Y[0] - Y[1]))
        polygon = cv2.ellipse2Poly((int(mY), int(mX)), (int(length / 2), int(stickwidth * ratio)), int(angle), 0, 360, 1)
        cv2.fillConvexPoly(canvas, polygon, [int(float(c) * 0.6) for c in color])

    for keypoint, color in zip(keypoints, colors):
        if keypoint is None:
            continue

        x, y = keypoint.x, keypoint.y
        x = int(x * W)
        y = int(y * H)
        cv2.circle(canvas, (int(x), int(y)), int(4 * ratio), color, thickness=-1)

    return canvas

Thank you for using the model and for the report. I will update the model card later.

Here are the pose I regenerated (using the image you provided in your first post) and the ControlNet result using the new pose image. I think this can solve your problem. Enjoy!
source pose
111.jpg
detect pose
image.png
controlnet generation
image.png

Hi, something I just don't understand.

  1. Your code increases the stick width according to the image size. Why do you do that? Is this common practice for ControlNet OpenPose, or does your model require it?
  2. Can you explain why ControlNet needs to detect the OpenPose skeleton? It already has an OpenPose image as input, from the preprocessor. I just don't know how it works.

1. I increase the stick width because the training data for SDXL (1024 * 1024) has a higher resolution than that for SD1.5 (512 * 512). I generate the pose at a higher resolution to increase the pose label accuracy. With the default openpose width setting, the skeleton is not clearly visible in higher-resolution images, such as 2K or higher. Increasing the width according to the resolution is a reasonable way to get a similar visual effect and improve generation.
2. The image you send to the openpose detector is an RGB image, not the pose image; the pose image is detected by the openpose model from your RGB image. The code I provided is for the case where you have an RGB image and use the openpose detector to generate the pose skeleton: with it, the detector draws a thick skeleton. I mean that you should regenerate your pose skeleton image instead of using the previous one. That fixes the problem.
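The scaling rule from point 1 can be condensed into a few lines. This sketch reproduces the thresholds of the replacement draw_bodypose function; `stick_scale` is a hypothetical helper name, not part of controlnet_aux.

```python
def stick_scale(width: int, height: int) -> float:
    """Scale factor applied to the stick width and keypoint radius."""
    long_side = max(width, height)
    # The factor starts at 1.0 and grows by one for each threshold reached.
    thresholds = (500, 1000, 2000, 3000, 4000, 5000)
    return 1.0 + sum(long_side >= t for t in thresholds)


BASE_STICKWIDTH = 4  # default stick width in draw_bodypose
BASE_RADIUS = 4      # default keypoint radius in draw_bodypose

for size in (512, 1024, 2048):
    scale = stick_scale(size, size)
    print(size, scale, BASE_STICKWIDTH * scale, BASE_RADIUS * scale)
```

At 1024 * 1024 the factor is 3.0, so limbs and joints are drawn three times thicker than with the default setting.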

My file path is D:\ComfyUI aki v1.1\Python\Lib\site-packages\controlnet_aux\open_pose. I have changed the code in the util.py file, but openpose still does not work properly. I am running it in ComfyUI.

just like this!
图片.png

Can you run the example code first? Hand detection should be set to false. If the image is 1024, the openpose lines you drew do not seem to be enhanced.
ComfyUI has its own controlnet_aux package; you need to change that one if you want to use ComfyUI.
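A minimal sketch of regenerating the skeleton outside ComfyUI with the controlnet_aux OpenposeDetector, with hand and face detection switched off. It assumes a controlnet_aux version whose detector accepts the keyword arguments shown and that the annotator weights are loaded from "lllyasviel/Annotators"; `render_pose` is a hypothetical wrapper name.

```python
def render_pose(rgb_image, resolution=1024):
    """Detect the body pose in an RGB photo and return the skeleton image.

    rgb_image is a PIL image of the person, not an existing pose image.
    """
    # Deferred import: controlnet_aux pulls in torch and OpenCV.
    from controlnet_aux import OpenposeDetector

    # Assumption: the annotator weights come from this Hub repository.
    detector = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
    return detector(
        rgb_image,
        detect_resolution=resolution,
        image_resolution=resolution,
        include_hand=False,  # hand detection off, as advised above
        include_face=False,
    )
```

If the patched draw_bodypose is in the package that actually gets imported, the sticks in a 1024 output should look clearly thicker than the default ones.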
image.png

image.png

This is the fix, and thankfully, if you use the comfyui_controlnet_aux DWPose Estimator, you won't have to monkeypatch the code! Just toggle scale_stick_for_xinsr_cn on!

image.png
