What kind of preprocessing is required for this? Grayscale depth maps do not appear to work.

#7 opened by nmkd

I've fiddled with ZoeDepth but can't get it to give me the kind of depth map that this ControlNet expects.
The 1.5 model just uses grayscale maps.

From the examples it looks like an RGB image, with the values from the channels in GRB order being treated as a single 24-bit value: black is closest, white is farthest away. Value-wise, that's the inverse of the 'inferno' depth maps that MiDaS outputs. I can't think of a clever way to invert the values at the moment other than doing all the maths.
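If that guess is right, the maths is just byte packing and a subtraction. Here is a minimal numpy sketch of the idea; the GRB byte order, the [0, 1] normalization, and the "1.0 = nearest" MiDaS-style convention are all assumptions taken from the description above, not confirmed behaviour of the model:

    import numpy as np

    def pack_depth_grb(depth):
        # depth: float array in [0, 1], MiDaS-style (assumed 1.0 = nearest).
        # Invert first so 0 maps to the nearest point (black = closest),
        # then scale to the full 24-bit range.
        packed = np.round((1.0 - depth) * 0xFFFFFF).astype(np.uint32)
        img = np.zeros((*depth.shape, 3), dtype=np.uint8)
        img[..., 1] = (packed >> 16) & 0xFF  # G: most significant byte (assumed)
        img[..., 0] = (packed >> 8) & 0xFF   # R: middle byte
        img[..., 2] = packed & 0xFF          # B: least significant byte
        return img

Inverting after the fact would just be 0xFFFFFF - packed, which is the same arithmetic moved around.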

Have you tried the example images? Do they work?

Yes, the examples work, but I can't get my own inputs working.

@Vargol thanks, I managed to make it work by adding the following lines to ComfyUI\custom_nodes\comfy_controlnet_preprocessors\v11\zoe\__init__.py:

    # depth should already be normalized to [0, 1] at this point
    import matplotlib.pyplot as plt
    colored_depth = plt.cm.inferno(depth)  # apply the "inferno" colormap -> RGBA floats
    depth_image = (colored_depth[:, :, :3] * 255).astype(np.uint8)  # drop alpha, scale to 8-bit RGB
    depth_image = 255 - depth_image  # invert the "inferno" palette (np is already imported in that file)

right before return depth_image. The preprocessor now returns an inverted inferno map, which is what this model appears to be trained on.
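For anyone who wants the same transform outside ComfyUI, here is a standalone sketch of it. The normalization step and the depth.png / output file names are placeholders, and ZoeDepth's raw output is metric depth, so it needs scaling to [0, 1] before the colormap is applied:

    import numpy as np
    import matplotlib.pyplot as plt
    from PIL import Image

    def inverted_inferno(depth):
        # Normalize to [0, 1] (placeholder min-max scaling; adjust to taste).
        d = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)
        colored = plt.cm.inferno(d)[..., :3]       # RGBA floats -> RGB floats
        img = (colored * 255).astype(np.uint8)     # scale to 8-bit
        return 255 - img                           # invert the palette

    # Example: convert an existing grayscale depth map (hypothetical file names).
    depth = np.asarray(Image.open("depth.png").convert("L"), dtype=np.float32)
    Image.fromarray(inverted_inferno(depth)).save("depth_inferno_inverted.png")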

It works:

[image: example result generated with the inverted inferno depth map]
