Versatile Diffusion (v1.0, four-flow)

We built Versatile Diffusion (VD), the first unified multi-flow multimodal diffusion framework, as a step towards Universal Generative AI. Versatile Diffusion natively supports four flows: image-to-text, image-variation, text-to-image, and text-variation. It can also be extended to other applications such as semantic-style disentanglement, image-text dual-guided generation, and latent image-to-text-to-image editing. Future versions will support more modalities, such as speech, music, video, and 3D.