arxiv:2311.09257

UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs

Published on Nov 14, 2023
· Submitted by akhaliq on Nov 17, 2023
#2 Paper of the day

Abstract

Text-to-image diffusion models have demonstrated remarkable capabilities in transforming textual prompts into coherent images, yet the computational cost of their inference remains a persistent challenge. To address this issue, we present UFOGen, a novel generative model designed for ultra-fast, one-step text-to-image synthesis. In contrast to conventional approaches that focus on improving samplers or employing distillation techniques for diffusion models, UFOGen adopts a hybrid methodology, integrating diffusion models with a GAN objective. Leveraging a newly introduced diffusion-GAN objective and initialization with pre-trained diffusion models, UFOGen excels in efficiently generating high-quality images conditioned on textual descriptions in a single step. Beyond traditional text-to-image generation, UFOGen showcases versatility in applications. Notably, UFOGen stands among the pioneering models enabling one-step text-to-image generation and diverse downstream tasks, presenting a significant advancement in the landscape of efficient generative models.

*Work done as a student researcher at Google; dagger indicates equal contribution.

Community

The main figure presented in the paper:
IMG_8362.jpeg

I think the LCM (and LCM-LoRA) results are much better than they have showcased here.

Sure it's capable of much better results at higher steps... but at 1 step or 2 as shown in the example?

I'd guess that they used the original LCM_Dreamshaper_v7 model, which produces results like that at 2 and 4 steps.
The distilled LCM SDXL produces much better images at 2 and 4 steps than the examples shown.

It's normal in a paper that uses Stable Diffusion outputs to select the worst outputs, even if they are bragging about the Stable Diffusion outputs.
