kvablack/ddpo-alignment

kvablack committed on May 26, 2023

Commit 3ce0e3a (1 parent: 9ee26ab)

Update README.md

Files changed (1)

README.md +1 -1

@@ -19,7 +19,7 @@ inference:
 This model was finetuned from [Stable Diffusion v1-5](https:/runwayml/stable-diffusion-v1-5) using [DDPO](https://arxiv.org/abs/2305.13301) and a reward function that uses [LLaVA](https://llava-vl.github.io/) to measure prompt-image alignment. See [the project website](https://rl-diffusion.github.io/) for more details.
-The model was finetuned for 120 iterations with a batch size of 256 samples per iteration. During finetuning, we used prompts of the form: "_a(n) \<animal\> \<activity\>_". We selected the animal and activity from the following lists, so try those for the best results. However, we also observed limited generalization to other prompts.
+The model was finetuned for 200 iterations with a batch size of 256 samples per iteration. During finetuning, we used prompts of the form: "_a(n) \<animal\> \<activity\>_". We selected the animal and activity from the following lists, so try those for the best results. However, we also observed limited generalization to other prompts.
 Activities:
 - washing dishes
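
The prompt template named in the changed README line, "a(n) \<animal\> \<activity\>", can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the animal names are hypothetical placeholders (the README's animal list is not part of this hunk), `article` and `build_prompts` are made-up helper names, and the vowel check is only a rough heuristic for choosing "a" or "an".

```python
from itertools import product

# Hypothetical placeholder animals: the README's actual animal list
# is not visible in this diff hunk.
ANIMALS = ["cat", "otter"]

# "washing dishes" is the one activity visible in the hunk.
ACTIVITIES = ["washing dishes"]


def article(noun: str) -> str:
    """Pick "a" or "an" from the first letter (rough heuristic)."""
    return "an" if noun[0].lower() in "aeiou" else "a"


def build_prompts(animals, activities):
    """Return every "a(n) <animal> <activity>" combination."""
    return [
        f"{article(animal)} {animal} {activity}"
        for animal, activity in product(animals, activities)
    ]


prompts = build_prompts(ANIMALS, ACTIVITIES)
print(prompts)
```

Each resulting string could then be passed as the prompt to a pipeline loaded from this repository, presumably with `StableDiffusionPipeline.from_pretrained("kvablack/ddpo-alignment")` from the `diffusers` library, given the repository's pipeline tag.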