Modular ChronoEdit

Modular implementation of nvidia/ChronoEdit-14B-Diffusers.

Code

"""
Adapted from https://huggingface.co/spaces/nvidia/ChronoEdit/blob/main/app.py
"""

from diffusers.modular_pipelines import WanModularPipeline, ModularPipelineBlocks
from diffusers.utils import load_image
from diffusers import UniPCMultistepScheduler
import torch
from PIL import Image

repo_id = "diffusers-internal-dev/chronoedit-modular"
blocks = ModularPipelineBlocks.from_pretrained(repo_id, trust_remote_code=True)
pipe = WanModularPipeline(blocks, repo_id)
pipe.load_components(
    trust_remote_code=True,
    device_map="cuda",
    torch_dtype={"default": torch.bfloat16, "image_encoder": torch.float32},
)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config, flow_shift=2.0)
pipe.load_lora_weights("nvidia/ChronoEdit-14B-Diffusers", weight_name="lora/chronoedit_distill_lora.safetensors")
pipe.fuse_lora(lora_scale=1.0)

image = load_image("https://huggingface.co/spaces/nvidia/ChronoEdit/resolve/main/examples/3.png")
prompt = "Transform the image so that inside the floral teacup of steaming tea, a small, cute mouse is sitting and taking a bath; the mouse should look relaxed and cheerful, with a tiny white bath towel draped over its head as if enjoying a spa moment, while the steam rises gently around it, blending seamlessly with the warm and cozy atmosphere."

# image is resized within the pipeline unlike https://huggingface.co/spaces/nvidia/ChronoEdit/blob/main/app.py#L151
# refer to `ChronoEditImageInputStep`.
out = pipe(
    image=image,
    prompt=prompt,  # todo: enhance prompt
    num_inference_steps=8,  # todo: implement temporal reasoning
    num_frames=5,  # https://huggingface.co/spaces/nvidia/ChronoEdit/blob/main/app.py#L152
    output_type="np",
    generator=torch.manual_seed(0),
)
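# Index 0 selects the first (and only) video in the batch.
frames = out.values["videos"][0]
# Scale the last frame from floats to uint8 and save it as the result image.
Image.fromarray((frames[-1] * 255).clip(0, 255).astype("uint8")).save("demo.png")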

You can find it here too.
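
The last two lines of the script turn the pipeline's float output into an image file. Below is a minimal sketch of that conversion as a standalone helper. It assumes, as the script does, that the frames arrive as a NumPy array shaped (num_frames, height, width, channels) with values nominally in [0, 1]. The names `frame_to_uint8` and `last_frame_uint8` are hypothetical, and unlike the script this sketch rounds to the nearest integer instead of truncating.

```python
import numpy as np


def frame_to_uint8(frame: np.ndarray) -> np.ndarray:
    """Map a float frame with values nominally in [0, 1] to uint8 in [0, 255]."""
    scaled = np.asarray(frame, dtype=np.float32) * 255.0
    # Clip before casting: values slightly outside [0, 255] would otherwise wrap around.
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)


def last_frame_uint8(frames: np.ndarray) -> np.ndarray:
    """Return the final frame of a (num_frames, H, W, C) array as uint8.

    Assumption: the array layout matches what the script indexes with frames[-1].
    """
    return frame_to_uint8(frames[-1])
```

With Pillow installed, `Image.fromarray(last_frame_uint8(frames)).save("demo.png")` would then write the same kind of file as the script.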

Make sure diffusers is installed from source:

    pip install git+https://github.com/huggingface/diffusers
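
Before running the script, it can help to confirm that the installed diffusers build exposes the `diffusers.modular_pipelines` module that the script imports. The following is a small sketch using only the standard library; `module_available` is a hypothetical helper name, not part of diffusers.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the dotted module name can be located, False otherwise."""
    try:
        # For dotted names, find_spec imports the parent package first,
        # so a missing or broken parent surfaces as ImportError.
        return importlib.util.find_spec(name) is not None
    except ImportError:
        return False


if __name__ == "__main__":
    if not module_available("diffusers.modular_pipelines"):
        print("diffusers.modular_pipelines not found; install diffusers from source.")
```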

Results

Transform the image so that inside the floral teacup of steaming tea, a small, cute mouse is sitting and taking a bath; the mouse should look relaxed and cheerful, with a tiny white bath towel draped over its head as if enjoying a spa moment, while the steam rises gently around it, blending seamlessly with the warm and cozy atmosphere.

Notes

This implementation does not yet include temporal reasoning.
It does not use a separate prompt enhancer model either; the prompt is passed to the pipeline as written.
