arxiv:2311.09257

UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs

Published on Nov 14, 2023
· Featured in Daily Papers on Nov 17, 2023

Abstract

Text-to-image diffusion models have demonstrated remarkable capabilities in transforming textual prompts into coherent images, yet the computational cost of their inference remains a persistent challenge. To address this issue, we present UFOGen, a novel generative model designed for ultra-fast, one-step text-to-image synthesis. In contrast to conventional approaches that focus on improving samplers or employing distillation techniques for diffusion models, UFOGen adopts a hybrid methodology, integrating diffusion models with a GAN objective. Leveraging a newly introduced diffusion-GAN objective and initialization with pre-trained diffusion models, UFOGen excels in efficiently generating high-quality images conditioned on textual descriptions in a single step. Beyond traditional text-to-image generation, UFOGen showcases versatility in applications. Notably, UFOGen stands among the pioneering models enabling one-step text-to-image generation and diverse downstream tasks, presenting a significant advancement in the landscape of efficient generative models.

(* Work done as a student researcher at Google; † indicates equal contribution.)

Community

The main figure presented in the paper:
[Image: IMG_8362.jpeg]

I think the LCM (and LCM-LoRA) results are much better than they have showcased here.

Sure, it's capable of much better results at higher step counts... but at 1 or 2 steps, as shown in the example?

I'd guess that they used the original LCM_Dreamshaper_v7 model, which produces results like that at 2 and 4 steps.
The distilled LCM SDXL produces much better images at 2 and 4 steps than the examples.

It's normal in a paper that uses Stable Diffusion outputs to select the worst outputs, even if they are bragging about the Stable Diffusion outputs.
