---
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
inference: true
---
# Aligned Diffusion Model via DPO
A diffusion model aligned with the DPO algorithm, using the following reward models:
```
closed-source VLMs: claude3-opus gemini-1.5 gpt-4o gpt-4v
open-source VLMs: internvl-1.5
score model: hps-2.1
```
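For readers unfamiliar with DPO on diffusion models, the per-pair objective is commonly written as a logistic loss over noise-prediction errors. The block below is a minimal sketch of that general form, not the training code used for this checkpoint. The function name, argument names, and the single `scale` factor (standing in for the DPO temperature and any timestep weighting) are illustrative assumptions.

```python
import math


def _log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def diffusion_dpo_loss(
    err_w_policy: float,
    err_w_ref: float,
    err_l_policy: float,
    err_l_ref: float,
    scale: float,
) -> float:
    """Sketch of a DPO-style loss for one (preferred, rejected) image pair.

    Each err_* is a squared noise-prediction error at a sampled timestep:
    `w` marks the preferred image, `l` the rejected one; `policy` is the
    model being trained, `ref` the frozen reference model.
    `scale` is a hypothetical name absorbing beta and timestep weighting.
    """
    # How much the policy improves on the reference for each image.
    preferred_gap = err_w_policy - err_w_ref
    rejected_gap = err_l_policy - err_l_ref
    # The loss falls when the policy denoises the preferred image better
    # (relative to the reference) than it denoises the rejected image.
    return -_log_sigmoid(-scale * (preferred_gap - rejected_gap))
```

When the policy equals the reference, both gaps are zero and the loss is `log 2`. It drops below that value as the policy favors the preferred image.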
## How to Use
You can load the model and perform inference as follows:
```python
import torch
from diffusers import StableDiffusionPipeline, UNet2DConditionModel

pretrained_model_name = "runwayml/stable-diffusion-v1-5"

# Load the DPO-aligned UNet from the checkpoint directory.
dpo_unet = UNet2DConditionModel.from_pretrained(
    "path/to/checkpoint",
    subfolder='unet',
    torch_dtype=torch.float16
).to('cuda')

# Load the base pipeline and swap in the aligned UNet.
pipeline = StableDiffusionPipeline.from_pretrained(pretrained_model_name, torch_dtype=torch.float16)
pipeline = pipeline.to('cuda')
pipeline.safety_checker = None
pipeline.unet = dpo_unet

# Fix the seed so sampling is reproducible.
generator = torch.Generator(device='cuda')
generator = generator.manual_seed(1)

# Assumption: `gs` was undefined in the original snippet; 7.5 is the
# diffusers default guidance scale. Adjust as needed.
gs = 7.5
prompt = "a pink flower"
image = pipeline(prompt=prompt, generator=generator, guidance_scale=gs).images[0]
```
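To judge the aligned model on more than one sample, it can help to run the same prompt over several fixed seeds. The helper below is a sketch of that idea. `generate_for_seeds` and `make_generator` are hypothetical names, not part of diffusers, and the default `guidance_scale` of 7.5 is an assumption.

```python
def _torch_generator(seed, device):
    """Create a seeded torch.Generator (torch is imported lazily)."""
    import torch

    return torch.Generator(device=device).manual_seed(seed)


def generate_for_seeds(
    pipeline,
    prompt,
    seeds,
    guidance_scale=7.5,  # assumed default; tune for your use case
    device="cuda",
    make_generator=_torch_generator,
):
    """Run `pipeline` once per seed and return a {seed: image} dict.

    `pipeline` is any callable with the StableDiffusionPipeline call
    signature used above (prompt, generator, guidance_scale) whose
    result exposes an `.images` list.
    """
    images = {}
    for seed in seeds:
        generator = make_generator(seed, device)
        result = pipeline(
            prompt=prompt,
            generator=generator,
            guidance_scale=guidance_scale,
        )
        images[seed] = result.images[0]
    return images
```

With the `pipeline` object from the snippet above, `generate_for_seeds(pipeline, "a pink flower", [1, 2, 3])` returns one image per seed. Running the same call on an unmodified base pipeline gives a like-for-like comparison.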
## Citation
```
@misc{mjbench2024mjbench,
title={MJ-BENCH: Is Your Multimodal Reward Model Really a Good Judge?},
author={Chen*, Zhaorun and Du*, Yichao and Wen, Zichen and Zhou, Yiyang and Cui, Chenhang and Weng, Zhenzhen and Tu, Haoqin and Wang, Chaoqi and Tong, Zhengwei and Huang, Leria and Chen, Canyu and Ye, Qinghao and Zhu, Zhihong and Zhang, Yuqing and Zhou, Jiawei and Zhao, Zhuokai and Rafailov, Rafael and Finn, Chelsea and Yao, Huaxiu},
year={2024}
}
```