portrait photo of a girl, photograph, highly detailed face, depth of field, moody light, golden hour, style by Dan Winters, Russell James, Steve McCurry, centered, extremely detailed, Nikon D850, award winning photography
Self-portrait oil painting, a beautiful cyborg with golden hair, 8k
Astronaut in a jungle, cold color palette, muted colors, detailed, 8k
A photo of beautiful mountain with realistic sunset and blue lake, highly detailed, masterpiece

This is an experimental checkpoint that exists solely to validate whether ORPO can be applied to a diffusion model.

Model description

These are the LoRA weights for Diffusion ORPO. Diffusion ORPO is an effort to align a text-conditioned diffusion model on preference data without having to use a reference model. ORPO was originally proposed in [1].
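
For reference, the ORPO objective as defined in [1] for language models can be written as follows. Here $y_w$ and $y_l$ denote the preferred and rejected outputs for an input $x$, $\mathcal{L}_{\mathrm{SFT}}$ is the usual negative log-likelihood on $y_w$, and $\lambda$ weights the odds-ratio term. How the likelihood terms are instantiated for a diffusion model is specific to the training script and is not restated here.

```latex
\begin{align}
\mathcal{L}_{\mathrm{ORPO}} &= \mathbb{E}_{(x,\, y_w,\, y_l)}\left[ \mathcal{L}_{\mathrm{SFT}} + \lambda \cdot \mathcal{L}_{\mathrm{OR}} \right] \\
\mathcal{L}_{\mathrm{OR}} &= -\log \sigma\left( \log \frac{\mathrm{odds}_\theta(y_w \mid x)}{\mathrm{odds}_\theta(y_l \mid x)} \right) \\
\mathrm{odds}_\theta(y \mid x) &= \frac{P_\theta(y \mid x)}{1 - P_\theta(y \mid x)}
\end{align}
```

Because the preference signal comes from the odds ratio of the model being trained, no frozen reference model is needed.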

Training was conducted using the script train_diffusion_orpo_sdxl_lora_wds.py which can be found here.

Full training command used:

accelerate launch --multi_gpu train_diffusion_orpo_sdxl_lora_wds.py \
  --pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-1.0  \
  --pretrained_vae_model_name_or_path=madebyollin/sdxl-vae-fp16-fix \
  --dataset_path="pipe:aws s3 cp s3://diffusion-preference-opt/{00000..00644}.tar -" \
  --output_dir="diffusion-sdxl-orpo-wds" \
  --mixed_precision="fp16" \
  --gradient_accumulation_steps=1 \
  --gradient_checkpointing \
  --use_8bit_adam \
  --rank=8 \
  --dataloader_num_workers=8 \
  --learning_rate=3e-5 \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=50000 \
  --checkpointing_steps=2000 \
  --run_validation --validation_steps=500 \
  --seed="0"
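
As a rough illustration of what the step-related flags above imply, the arithmetic below counts how many checkpoints and validation passes a full run would produce. It assumes each event fires whenever the global step is a positive multiple of its interval, which is a common convention but is an assumption here rather than something stated in this card.

```python
# Values copied from the training command above.
max_train_steps = 50_000
checkpointing_steps = 2_000
validation_steps = 500

# Assumption: an event fires at every positive multiple of its interval,
# up to and including the final training step.
checkpoint_at = list(range(checkpointing_steps, max_train_steps + 1, checkpointing_steps))
validate_at = list(range(validation_steps, max_train_steps + 1, validation_steps))

print(len(checkpoint_at))  # 25 checkpoints
print(len(validate_at))    # 100 validation passes
```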

Refer here for more details on data loading, which relies on webdataset during training.
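
To make the --dataset_path value above more concrete: the brace range names one shard per number, and the pipe: prefix tells webdataset to read each shard from the standard output of the given command. The sketch below only mimics the brace expansion in plain Python to show which shards are addressed. The helper name expand_shards is hypothetical and is not part of webdataset or the training script.

```python
import re


def expand_shards(url: str) -> list[str]:
    """Expand one {start..end} numeric range in a URL, preserving zero padding."""
    match = re.search(r"\{(\d+)\.\.(\d+)\}", url)
    if match is None:
        return [url]
    start, end = match.group(1), match.group(2)
    width = len(start)  # keep the same number of digits as in the pattern
    return [
        url[: match.start()] + str(i).zfill(width) + url[match.end():]
        for i in range(int(start), int(end) + 1)
    ]


dataset_path = "pipe:aws s3 cp s3://diffusion-preference-opt/{00000..00644}.tar -"
shards = expand_shards(dataset_path)

print(len(shards))   # 645
print(shards[0])     # pipe:aws s3 cp s3://diffusion-preference-opt/00000.tar -
print(shards[-1])    # pipe:aws s3 cp s3://diffusion-preference-opt/00644.tar -
```

Each expanded entry is a separate tar shard, so the training set is streamed from 645 shards rather than downloaded up front.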

Here's the corresponding run page on WandB. It gives a side-by-side comparison of the results with and without the Diffusion ORPO LoRA parameters.

Training details

The model was trained on the training split of the yuvalkirstain/pickapic_v2 dataset.

How to use

Make sure you have the latest versions of the libraries installed:

pip install -U diffusers accelerate transformers peft

And then run:

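from diffusers import DiffusionPipeline
import torch

pipe_id = "stabilityai/stable-diffusion-xl-base-1.0"
pipe = DiffusionPipeline.from_pretrained(pipe_id, torch_dtype=torch.float16).to("cuda")
# Load the Diffusion ORPO LoRA weights on top of the base SDXL pipeline.
# "<lora-repo-id>" is a placeholder: replace it with the ID of this LoRA repository.
pipe.load_lora_weights("<lora-repo-id>")

prompt = "A high-quality photo of a spaceship that looks like the head of a horse."
image = pipe(prompt, num_inference_steps=30).images[0]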

Refer to this guide to learn more about LoRA inference in diffusers.
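
To reproduce a with/without comparison locally, one option is to generate the same prompt twice from the same seed, once with the base pipeline and once after loading the LoRA weights. The sketch below is a minimal example under a few assumptions: the function name compare_with_and_without_lora is hypothetical, lora_id must be set to the ID of this LoRA repository, and a CUDA GPU is available.

```python
def compare_with_and_without_lora(lora_id, prompt, seed=0, num_inference_steps=30):
    """Generate one prompt with base SDXL and again with LoRA weights applied."""
    # Imported inside the function so that defining the helper does not
    # require torch or diffusers to be installed.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")

    def generate():
        # Re-seed on every call so both images start from the same initial noise.
        generator = torch.Generator(device="cuda").manual_seed(seed)
        return pipe(
            prompt, num_inference_steps=num_inference_steps, generator=generator
        ).images[0]

    base_image = generate()          # base model only
    pipe.load_lora_weights(lora_id)  # apply the LoRA weights
    lora_image = generate()          # base model + LoRA
    return base_image, lora_image
```

The function returns two PIL images, which can be written to disk with their save method and viewed side by side. Fixing the seed keeps the initial latent noise identical, so visible differences come from the LoRA weights and not from sampling randomness.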


[1] ORPO: Monolithic Preference Optimization without Reference Model; Jiwoo Hong, Noah Lee, James Thorne; https://arxiv.org/abs/2403.07691.
