ZeroDiffusion / README.md
license: creativeml-openrail-m
datasets:
  - ChristophSchuhmann/improved_aesthetics_6plus
  - drhead/laion_hd_21M_deduped
pipeline_tag: text-to-image

Currently released models:

ZeroDiffusion-Base v0.9 (zd_base_v0-9 and zd_base_v0-9_ema) - a base model trained with zero terminal SNR on roughly 20 million samples

ZeroDiffusion-Inpaint v0.9 (zd_inpaint_v0-9 and zd_inpaint_v0-9_ema) - an experimental finetune of the stable-diffusion-inpainting model, initialized from a merge of ZD 0.9

This is a work-in-progress model, fine-tuned from SD 1.5 with zero terminal SNR.

ZeroDiffusion v0.9 is intended as a final prototype made from a complete training run.

ZeroDiffusion v1.0 will most likely start training if/when SD 1.6 model weights are released.

This model is intended as a training base for other models, and as a clean base on which researchers can test zero terminal SNR.
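For readers unfamiliar with the term, the following is a minimal sketch of how a noise schedule can be rescaled so that its final step has a signal-to-noise ratio of exactly zero. It is not taken from this model's training code. The scaled-linear schedule and its endpoints (0.00085 to 0.012 over 1000 steps) are assumed to match the usual SD 1.x configuration, and all function names are illustrative.

```python
import math

def scaled_linear_betas(n_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Scaled-linear beta schedule (assumed SD 1.x defaults)."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    return [(lo + (hi - lo) * i / (n_steps - 1)) ** 2 for i in range(n_steps)]

def rescale_zero_terminal_snr(betas):
    """Shift and scale sqrt(alpha_bar) so the last step has alpha_bar == 0."""
    # The cumulative product of (1 - beta) gives alpha_bar at each step.
    sqrt_ab = []
    running = 1.0
    for b in betas:
        running *= 1.0 - b
        sqrt_ab.append(math.sqrt(running))
    first, last = sqrt_ab[0], sqrt_ab[-1]
    # Shift so the final value is zero, then scale so the first value is kept.
    sqrt_ab = [(s - last) * first / (first - last) for s in sqrt_ab]
    alpha_bar = [s * s for s in sqrt_ab]
    # Recover per-step betas from ratios of consecutive alpha_bar values.
    new_betas = [1.0 - alpha_bar[0]]
    for prev, cur in zip(alpha_bar, alpha_bar[1:]):
        new_betas.append(1.0 - cur / prev)
    return new_betas

def terminal_snr(betas):
    """SNR at the final step: alpha_bar / (1 - alpha_bar)."""
    ab = 1.0
    for b in betas:
        ab *= 1.0 - b
    return ab / (1.0 - ab)

betas = scaled_linear_betas()
zero_snr_betas = rescale_zero_terminal_snr(betas)
print(terminal_snr(betas), terminal_snr(zero_snr_betas))
```

With the unmodified schedule the final step still carries a small amount of signal. After rescaling, the last beta becomes 1 and the final step is pure noise.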

For this model to work well, you will probably need CFG rescale, which is implemented in this plugin: https://github.com/Seshelle/CFG_Rescale_webui
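As a rough illustration of what CFG rescale does, the sketch below applies standard classifier-free guidance, then rescales the result so that its standard deviation matches that of the conditional prediction, and finally blends the rescaled and unrescaled outputs. It is a simplified version that works on flat lists of floats and is not the plugin's code. The function name, parameter names, and default values are assumptions chosen for illustration.

```python
import statistics

def cfg_rescale(cond, uncond, guidance_scale=7.5, rescale=0.7):
    """Classifier-free guidance followed by a standard-deviation rescale.

    cond / uncond: flat lists holding the conditional and unconditional
    model outputs for a single sample (illustrative stand-ins for tensors).
    """
    # Standard classifier-free guidance.
    cfg = [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]
    # Match the guided output's spread to the conditional prediction's.
    std_cond = statistics.pstdev(cond)
    std_cfg = statistics.pstdev(cfg)
    rescaled = [x * std_cond / std_cfg for x in cfg]
    # Blend: rescale=0 is plain CFG, rescale=1 is fully rescaled.
    return [rescale * r + (1.0 - rescale) * x for r, x in zip(rescaled, cfg)]

cond = [1.0, -1.0, 0.5, -0.5]
uncond = [0.2, -0.1, 0.1, -0.2]
print(cfg_rescale(cond, uncond))
```

Lower values of `rescale` stay closer to plain CFG, which is one way to trade off overexposure against washed-out results.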

Dynamic Thresholding is a possible alternative to CFG rescale. With the right settings it stabilizes images without the brownout that CFG rescale often causes, but it requires more tuning to work. Get Dynamic Thresholding for A1111 here: https://github.com/mcmonkeyprojects/sd-dynamic-thresholding/
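The general idea behind dynamic thresholding can be sketched as follows: pick a high percentile of the absolute values in the predicted sample, clamp to that range, and divide by it when it exceeds 1. This is only a sketch of the basic technique on a flat list of floats. The A1111 extension linked above has its own settings and may work differently, and the function name and default percentile here are assumptions.

```python
def dynamic_threshold(x0, percentile=0.995):
    """Clamp a predicted sample to a percentile-based range and renormalize."""
    # Find the chosen percentile of the absolute values.
    mags = sorted(abs(v) for v in x0)
    idx = min(len(mags) - 1, int(percentile * len(mags)))
    # Never use a threshold below 1, so in-range samples are left alone.
    s = max(mags[idx], 1.0)
    # Clamp to [-s, s], then scale back into [-1, 1].
    return [max(-s, min(s, v)) / s for v in x0]

print(dynamic_threshold([4.0, -2.0, 1.0, 0.5], percentile=0.5))
```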

You must also download the corresponding YAML file and place it in the same folder as the model (assuming you are using A1111's webui or similar). The model will not work without it, because the YAML file tells webui to run the model in v-prediction mode.
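To show why the prediction mode matters, here is a minimal sketch of the v-prediction parameterization, assuming the usual definitions `x_t = a * x0 + s * eps` and `v = a * eps - s * x0`, with `a = sqrt(alpha_bar)` and `s = sqrt(1 - alpha_bar)`. The function names are illustrative and the code works on flat lists of floats, not on the tensors a real sampler would use.

```python
import math

def _coeffs(alpha_bar):
    """Signal and noise coefficients for a given alpha_bar."""
    return math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)

def add_noise(x0, eps, alpha_bar):
    """Forward process: x_t = a * x0 + s * eps."""
    a, s = _coeffs(alpha_bar)
    return [a * x + s * e for x, e in zip(x0, eps)]

def v_target(x0, eps, alpha_bar):
    """Training target in v-prediction mode: v = a * eps - s * x0."""
    a, s = _coeffs(alpha_bar)
    return [a * e - s * x for x, e in zip(x0, eps)]

def x0_from_v(x_t, v, alpha_bar):
    """Recover the clean sample: x0 = a * x_t - s * v."""
    a, s = _coeffs(alpha_bar)
    return [a * xt - s * vv for xt, vv in zip(x_t, v)]

def eps_from_v(x_t, v, alpha_bar):
    """Recover the noise: eps = s * x_t + a * v."""
    a, s = _coeffs(alpha_bar)
    return [s * xt + a * vv for xt, vv in zip(x_t, v)]

x0, eps = [0.3, -0.7], [1.2, 0.4]
x_t = add_noise(x0, eps, alpha_bar=0.0)    # final step of a zero terminal SNR schedule
v = v_target(x0, eps, alpha_bar=0.0)
print(x_t, v, x0_from_v(x_t, v, alpha_bar=0.0))
```

At `alpha_bar = 0` the noisy input equals the noise itself, so predicting the noise would reveal nothing about the image, while `v` reduces to `-x0`. A frontend that interprets the model's output as a noise prediction will therefore decode it incorrectly.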