
Stable Diffusion UniControl 1.1 Model Card

Stable Diffusion UniControl is a latent text-to-image diffusion model capable of generating photo-realistic images from a text prompt together with a condition image (e.g., an edge map, depth map, or segmentation map).

You can use this with the 🧨 Diffusers Plus Plus library, our fork of 🧨 Diffusers.

Note: for now, please install diffusers_plus_plus from GitHub; specifically, you need the unicontrol branch to access the UniControl pipeline.
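
As a rough sketch of what usage might look like once the fork is installed, the snippet below assumes the unicontrol branch exposes a UniControlPipeline class with a diffusers-style from_pretrained interface. The class name, import path, checkpoint id, and install URL are illustrative assumptions, not confirmed API; consult the fork's documentation for the exact entry points.

    # Hypothetical usage sketch -- the pipeline class, import path, checkpoint
    # id, and arguments are assumptions, not confirmed diffusers_plus_plus API.
    # Install the fork first (unicontrol branch), e.g.:
    #   pip install "git+https://github.com/<org>/diffusers_plus_plus.git@unicontrol"
    import torch
    from PIL import Image
    from diffusers_plus_plus import UniControlPipeline  # hypothetical import

    # Load the pipeline in half precision on a GPU (checkpoint id is a placeholder).
    pipe = UniControlPipeline.from_pretrained(
        "<this-model-repo>", torch_dtype=torch.float16
    ).to("cuda")

    # The model takes a text prompt plus a condition image (see Model Details below).
    condition = Image.open("condition.png")

    result = pipe(prompt="a cozy cabin in a snowy forest", image=condition)
    result.images[0].save("output.png")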

Original GitHub Repository

  1. The original code from the authors is available here.

Model Details

  • Developed by: Can Qin, Shu Zhang, Ning Yu, Yihao Feng, Xinyi Yang, Yingbo Zhou, Huan Wang, Juan Carlos Niebles, Caiming Xiong, Silvio Savarese, and others

  • Model type: Diffusion-based text-to-image generation model with ControlNet-style support for multiple conditionings

  • Language(s): English

  • Model Description: This is a model that can be used to generate and modify images based on a text prompt and a condition image (see the condition-image sketch after the citation below). It is a latent diffusion model that uses a fixed, pretrained text encoder (CLIP ViT-L/14), as suggested in the Imagen paper.

  • Resources for more information: GitHub Repository, Paper.

  • Cite as:

    @article{qin2023unicontrol,
      title={UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild},
      author={Qin, Can and Zhang, Shu and Yu, Ning and Feng, Yihao and Yang, Xinyi and Zhou, Yingbo and Wang, Huan and Niebles, Juan Carlos and Xiong, Caiming and Savarese, Silvio and others},
      journal={arXiv preprint arXiv:2305.11147},
      year={2023}
    }
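
Because the pipeline expects a condition image alongside the text prompt, an input photo is typically preprocessed into one of the supported conditions first. The snippet below is a small sketch that builds a Canny edge map with OpenCV; Canny edges are among the conditioning tasks described in the UniControl paper, and the filenames and thresholds here are illustrative.

    # Sketch: build a canny-edge condition image with OpenCV.
    # Requires: pip install opencv-python pillow numpy
    import cv2
    import numpy as np
    from PIL import Image

    # Canny expects a single-channel 8-bit image, so convert to grayscale first.
    gray = np.array(Image.open("input.png").convert("L"))
    edges = cv2.Canny(gray, threshold1=100, threshold2=200)

    # Replicate to three channels so the result matches an RGB condition image.
    condition = Image.fromarray(np.stack([edges] * 3, axis=-1))
    condition.save("condition.png")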
    