Diffusers documentation

Diffusers

You are viewing main version, which requires installation from source. If you'd like regular pip install, checkout the latest stable version (v0.14.0).
Hugging Face's logo
Join the Hugging Face community

and get access to the augmented documentation experience

to get started



Diffusers

🤗 Diffusers is the go-to library for state-of-the-art pretrained diffusion models for generating images, audio, and even 3D structures of molecules. Whether you’re looking for a simple inference solution or want to train your own diffusion model, 🤗 Diffusers is a modular toolbox that supports both. Our library is designed with a focus on usability over performance, simple over easy, and customizability over abstractions.

The library has three main components:

  • State-of-the-art diffusion pipelines for inference with just a few lines of code.
  • Interchangeable noise schedulers for balancing trade-offs between generation speed and quality.
  • Pretrained models that can be used as building blocks, and combined with schedulers, for creating your own end-to-end diffusion systems.

Supported pipelines

Pipeline Paper/Repository Tasks
alt_diffusion AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities Image-to-Image Text-Guided Generation
audio_diffusion Audio Diffusion Unconditional Audio Generation
controlnet Adding Conditional Control to Text-to-Image Diffusion Models Image-to-Image Text-Guided Generation
cycle_diffusion Unifying Diffusion Models’ Latent Space, with Applications to CycleDiffusion and Guidance Image-to-Image Text-Guided Generation
dance_diffusion Dance Diffusion Unconditional Audio Generation
ddpm Denoising Diffusion Probabilistic Models Unconditional Image Generation
ddim Denoising Diffusion Implicit Models Unconditional Image Generation
latent_diffusion High-Resolution Image Synthesis with Latent Diffusion Models Text-to-Image Generation
latent_diffusion High-Resolution Image Synthesis with Latent Diffusion Models Super Resolution Image-to-Image
latent_diffusion_uncond High-Resolution Image Synthesis with Latent Diffusion Models Unconditional Image Generation
paint_by_example Paint by Example: Exemplar-based Image Editing with Diffusion Models Image-Guided Image Inpainting
pndm Pseudo Numerical Methods for Diffusion Models on Manifolds Unconditional Image Generation
score_sde_ve Score-Based Generative Modeling through Stochastic Differential Equations Unconditional Image Generation
score_sde_vp Score-Based Generative Modeling through Stochastic Differential Equations Unconditional Image Generation
semantic_stable_diffusion Semantic Guidance Text-Guided Generation
stable_diffusion_text2img Stable Diffusion Text-to-Image Generation
stable_diffusion_img2img Stable Diffusion Image-to-Image Text-Guided Generation
stable_diffusion_inpaint Stable Diffusion Text-Guided Image Inpainting
stable_diffusion_panorama MultiDiffusion Text-to-Panorama Generation
stable_diffusion_pix2pix InstructPix2Pix: Learning to Follow Image Editing Instructions Text-Guided Image Editing
stable_diffusion_pix2pix_zero Zero-shot Image-to-Image Translation Text-Guided Image Editing
stable_diffusion_attend_and_excite Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models Text-to-Image Generation
stable_diffusion_self_attention_guidance Improving Sample Quality of Diffusion Models Using Self-Attention Guidance Text-to-Image Generation
stable_diffusion_image_variation Stable Diffusion Image Variations Image-to-Image Generation
stable_diffusion_latent_upscale Stable Diffusion Latent Upscaler Text-Guided Super Resolution Image-to-Image
stable_diffusion_model_editing Editing Implicit Assumptions in Text-to-Image Diffusion Models Text-to-Image Model Editing
stable_diffusion_2 Stable Diffusion 2 Text-to-Image Generation
stable_diffusion_2 Stable Diffusion 2 Text-Guided Image Inpainting
stable_diffusion_2 Depth-Conditional Stable Diffusion Depth-to-Image Generation
stable_diffusion_2 Stable Diffusion 2 Text-Guided Super Resolution Image-to-Image
stable_diffusion_safe Safe Stable Diffusion Text-Guided Generation
stable_unclip Stable unCLIP Text-to-Image Generation
stable_unclip Stable unCLIP Image-to-Image Text-Guided Generation
stochastic_karras_ve Elucidating the Design Space of Diffusion-Based Generative Models Unconditional Image Generation
text_to_video_sd Modelscope’s Text-to-video-synthesis Model in Open Domain Text-to-Video Generation
unclip Hierarchical Text-Conditional Image Generation with CLIP Latents(implementation by kakaobrain) Text-to-Image Generation
versatile_diffusion Versatile Diffusion: Text, Images and Variations All in One Diffusion Model Text-to-Image Generation
versatile_diffusion Versatile Diffusion: Text, Images and Variations All in One Diffusion Model Image Variations Generation
versatile_diffusion Versatile Diffusion: Text, Images and Variations All in One Diffusion Model Dual Image and Text Guided Generation
vq_diffusion Vector Quantized Diffusion Model for Text-to-Image Synthesis Text-to-Image Generation