Instruction-tuned Stable Diffusion for Cartoonization (Fine-tuned)

This pipeline is an 'instruction-tuned' version of Stable Diffusion (v1.5). It was fine-tuned from the existing InstructPix2Pix checkpoints.

Pipeline description

The motivation for this pipeline comes partly from FLAN and partly from InstructPix2Pix. The main idea is to first create an instruction-prompted dataset (as described in our blog) and then conduct InstructPix2Pix-style training. The end objective is to make Stable Diffusion better at following specific instructions that involve image transformations.
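To make the dataset idea concrete, here is a minimal sketch of how instruction-prompted records could be assembled from (original, cartoonized) image pairs. The field names, file paths, and the extra instruction variants are hypothetical and chosen only for illustration; only the first prompt appears in this card, and the authors' actual dataset construction is described in their blog.

```python
import random

# Hypothetical instruction variants. Only the first one appears in this card;
# the others are illustrative assumptions.
INSTRUCTIONS = [
    "Cartoonize the following image",
    "Turn this picture into a cartoon",
    "Render the image in a cartoon style",
]


def build_records(pairs, instructions=INSTRUCTIONS, seed=0):
    """Attach a randomly sampled instruction to each (original, cartoonized) pair."""
    rng = random.Random(seed)  # seeded so the sampling is reproducible
    records = []
    for original_path, cartoonized_path in pairs:
        records.append(
            {
                "original_image": original_path,        # hypothetical field name
                "edit_prompt": rng.choice(instructions),  # hypothetical field name
                "cartoonized_image": cartoonized_path,  # hypothetical field name
            }
        )
    return records


# Hypothetical file paths, for illustration only.
pairs = [
    ("images/0001.png", "cartoons/0001.png"),
    ("images/0002.png", "cartoons/0002.png"),
]
records = build_records(pairs)
```

Each record pairs an input image with an instruction and the target output, which is the triplet shape that InstructPix2Pix-style training consumes.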

See this post to learn more.

Training procedure and results

Training was conducted on the instruction-tuning-sd/cartoonization dataset. Refer to this repository for details. The training logs can be found here.

Here are some results derived from the pipeline:

Intended uses & limitations

You can use the pipeline to cartoonize an image, given the input image and an instruction prompt.

How to use

Here is how to use this model:

import torch
from diffusers import StableDiffusionInstructPix2PixPipeline
from diffusers.utils import load_image

model_id = "instruction-tuning-sd/cartoonizer"
pipeline = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    model_id, torch_dtype=torch.float16, use_auth_token=True
).to("cuda")  # assumes a CUDA GPU is available; float16 is intended for GPU inference

image_path = "https://hf.co/datasets/diffusers/diffusers-images-docs/resolve/main/mountain.png"
image = load_image(image_path)

image = pipeline("Cartoonize the following image", image=image).images[0]
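Large input images can be slow or memory-hungry to process. As a rough sketch, the helper below computes a target size whose longer side is capped and whose dimensions are multiples of 8. The function name `fit_size` and the defaults (512 and 8) are assumptions based on common Stable Diffusion v1.5 practice, not settings documented for this checkpoint.

```python
def fit_size(width, height, max_side=512, multiple=8):
    """Return (width, height) scaled so the longer side is at most max_side,
    with each dimension rounded down to a multiple of `multiple`."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic avoids floating-point rounding surprises.
        width = width * max_side // longest
        height = height * max_side // longest
    # Round down to the nearest multiple, but never below one multiple.
    new_width = max(multiple, width // multiple * multiple)
    new_height = max(multiple, height // multiple * multiple)
    return new_width, new_height


# Example: a 1024x768 photo is scaled down to 512x384.
target = fit_size(1024, 768)
```

With a PIL image, this could be applied before calling the pipeline as `image = image.resize(fit_size(*image.size))`.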

For notes on limitations, misuse, malicious use, and out-of-scope use, please refer to the model card here.



