sayakpaul/diffusion-sdxl-orpo
This is an experimental checkpoint and solely exists to validate if ORPO is possible on a diffusion model.
Model description
These are the LoRA weights for Diffusion ORPO. Diffusion ORPO is an effort to align a text-conditioned diffusion model on preference data without having to use a reference model. ORPO was originally proposed in [1].
Training was conducted using the script train_diffusion_orpo_sdxl_lora.py
proposed in
this PR.
Full training command used:
accelerate launch train_diffusion_orpo_sdxl_lora.py \
--pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-1.0 \
--pretrained_vae_model_name_or_path=madebyollin/sdxl-vae-fp16-fix \
--output_dir="diffusion-sdxl-orpo" \
--mixed_precision="fp16" \
--dataset_name=kashif/pickascore \
--train_batch_size=8 \
--gradient_accumulation_steps=2 \
--gradient_checkpointing \
--use_8bit_adam \
--rank=8 \
--learning_rate=1e-5 \
--report_to="wandb" \
--lr_scheduler="constant" \
--lr_warmup_steps=0 \
--max_train_steps=2000 \
--checkpointing_steps=500 \
--run_validation --validation_steps=50 \
--seed="0" \
--report_to="wandb" \
--push_to_hub
Here's the corresponding run page on WandB. It gives a side-by-side comparison of the results with and without the Diffusion ORPO LoRA parameters:
Training details
It was trained on a subset of yuvalkirstain/pickapic_v2
-- kashif/pickascore
.
How to use
Make sure you have the latest versions of the libraries installed:
pip install -U diffusers accelerate transformers peft
And then run:
from diffusers import DiffusionPipeline
import torch
pipe_id = "stabilityai/stable-diffusion-xl-base-1.0"
pipe = DiffusionPipeline.from_pretrained(pipe_id, torch_dtype=torch.float16).to("cuda")
pipe.load_lora_weights("sayakpaul/diffusion-sdxl-orpo")
image = pipe("A high-quality photo of a spaceship that looks like the head of a horse.", num_inference_steps=30).images[0]
image
Refer to this guide to know more about LoRA
inference in diffusers
.
Citations
[1] ORPO: Monolithic Preference Optimization without Reference Model; Jiwoo Hong, Noah Lee, James Thorne; https://arxiv.org/abs/2403.07691.
- Downloads last month
- 454