PixelDiT ControlNet + IP-Adapter

ControlNet scribble conditioning and IP-Adapter style transfer for PixelDiT-1300M.

Note: PixelDiT-1300M is a model by NVIDIA Research. This repo contains trained adapters only; we are not affiliated with NVIDIA.

Files

File                         Description
controlnet.safetensors       Combined ControlNet (7 blocks) + IP-Adapter weights
ip_adapter.safetensors       IP-Adapter weights only
hed_detector.safetensors     HED edge detector (Apache-2.0, VGG-based)
config.json                  Model config
train.py                     Joint ControlNet + IP-Adapter training script
precompute_wd_tags.py        Run WD tagger on dataset → wd_tags.json
precompute_embeddings.py     Encode images with SigLIP + Gemma → memmap files
precompute_hed.py            Precompute HED edge maps for a dataset
control_maps.py              Edge map post-processing utilities
hed.py                       HED model definition
convert_to_safetensors.py    Convert .pt checkpoints to safetensors
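
For a local copy of the repo, a quick check that the weight files are in place can save a failed pipeline load. The sketch below is a hypothetical helper, not part of this repo: `WEIGHT_FILES` and `missing_weight_files` are illustrative names, and the choice of files assumes only the three weights passed to the pipeline in the Usage section are needed.

```python
from pathlib import Path

# Assumption: these are the three weight files the Usage snippet passes
# to the pipeline. The tuple and function names are illustrative only.
WEIGHT_FILES = (
    "controlnet.safetensors",
    "ip_adapter.safetensors",
    "hed_detector.safetensors",
)


def missing_weight_files(directory):
    """Return the names from WEIGHT_FILES that are not files in `directory`."""
    root = Path(directory)
    return [name for name in WEIGHT_FILES if not (root / name).is_file()]
```

An empty list means all three files were found; any names returned still need to be downloaded.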

Usage

from diffusers.pipelines.pixeldit import PixelDiTStyledPipeline
from huggingface_hub import hf_hub_download
from PIL import Image
import torch

pipe = PixelDiTStyledPipeline.from_pretrained_styled(
    "madtune/pixeldit-diffusers",
    controlnet_path=hf_hub_download("madtune/pixeldit-controlnet", "controlnet.safetensors"),
    ip_adapter_path=hf_hub_download("madtune/pixeldit-controlnet", "ip_adapter.safetensors"),
    hed_ckpt_path=hf_hub_download("madtune/pixeldit-controlnet", "hed_detector.safetensors"),
    torch_dtype=torch.bfloat16,
)
# gpu_id=1 offloads to the second GPU; use gpu_id=0 (or omit it) on a single-GPU machine.
pipe.enable_model_cpu_offload(gpu_id=1)

out = pipe(
    image=Image.open("style_ref.jpg"),
    prompt="gothic pale woman, dramatic rim lighting",
    variation_strength=0.85,
    ctrl_strength=0.25,
    ip_strength=0.85,
    flow_shift=8.0,
    guidance_scale=4.5,
    num_inference_steps=50,
).images[0]
out.save("output.jpg")

Recommended settings

Mode               ctrl_strength  ip_strength  variation_strength
Pure variation     0.0            0.0          0.65–0.85
ControlNet only    0.25           0.0          0.85
IP-Adapter only    0.0            0.85         0.85
Full combo (best)  0.25           0.35–0.85    0.85

flow_shift=8.0 with guidance_scale=3.0–3.5 works well at 768px and above. guidance_scale=4.5 is also valid but produces oversaturated colours.
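
The recommended settings can be kept in a small preset lookup so each mode is selected by name. This is a minimal sketch, not part of the repo: `PRESETS` and `sampling_kwargs` are hypothetical names. Where the table gives a range, the sketch assumes the upper end (0.85), and it defaults `guidance_scale` to 3.5 following the note above.

```python
# Illustrative only: PRESETS and sampling_kwargs are hypothetical names,
# not part of this repo or of diffusers.
# Where the settings table gives a range, the upper end (0.85) is assumed.
PRESETS = {
    "pure_variation": {"ctrl_strength": 0.0, "ip_strength": 0.0, "variation_strength": 0.85},
    "controlnet_only": {"ctrl_strength": 0.25, "ip_strength": 0.0, "variation_strength": 0.85},
    "ip_adapter_only": {"ctrl_strength": 0.0, "ip_strength": 0.85, "variation_strength": 0.85},
    "full_combo": {"ctrl_strength": 0.25, "ip_strength": 0.85, "variation_strength": 0.85},
}


def sampling_kwargs(mode, guidance_scale=3.5, flow_shift=8.0, num_inference_steps=50):
    """Build keyword arguments for the pipeline call for a named mode."""
    if mode not in PRESETS:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(PRESETS)}")
    return {
        **PRESETS[mode],
        "guidance_scale": guidance_scale,
        "flow_shift": flow_shift,
        "num_inference_steps": num_inference_steps,
    }


if __name__ == "__main__":
    print(sampling_kwargs("full_combo"))
```

With the pipeline from the Usage section, the result would be unpacked into the call, for example `pipe(image=ref, prompt=text, **sampling_kwargs("full_combo")).images[0]`, where `ref` and `text` stand for your own reference image and prompt.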
