Papers
arxiv:2311.16567

MobileDiffusion: Subsecond Text-to-Image Generation on Mobile Devices

Published on Nov 28, 2023

Abstract

The deployment of large-scale text-to-image diffusion models on mobile devices is impeded by their substantial model size and slow inference speed. In this paper, we propose MobileDiffusion, a highly efficient text-to-image diffusion model obtained through extensive optimizations in both architecture and sampling techniques. We conduct a comprehensive examination of model architecture design to reduce redundancy, enhance computational efficiency, and minimize the model's parameter count, while preserving image generation quality. Additionally, we employ distillation and diffusion-GAN finetuning techniques on MobileDiffusion to achieve 8-step and 1-step inference, respectively. Empirical studies, conducted both quantitatively and qualitatively, demonstrate the effectiveness of our proposed techniques. MobileDiffusion achieves a remarkable sub-second inference speed for generating a 512×512 image on mobile devices, establishing a new state of the art.
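The abstract mentions reducing sampling to 8 steps via distillation. As a rough illustration only (the paper's code is not released, and its exact sampler and schedule are not specified here), the idea of running a deterministic DDIM-style sampler over a small, evenly spaced subset of the training timesteps can be sketched as follows; the toy denoiser, the linear beta schedule, and the 1000-step training horizon are all assumptions for the sketch.

```python
import math
import random

def ddim_sample(eps_model, dim=4, num_steps=8, seed=0):
    """Deterministic DDIM-style sampling with a reduced step count.

    eps_model(x, t) predicts the noise component of x at timestep t.
    With few steps (e.g. 8 after distillation), each update jumps
    across many training timesteps at once.
    """
    rng = random.Random(seed)
    T = 1000  # training timesteps (assumed, not from the paper)
    # Linear beta schedule -> cumulative alpha products (standard DDPM setup).
    betas = [1e-4 + (0.02 - 1e-4) * i / (T - 1) for i in range(T)]
    alpha_bar = []
    prod = 1.0
    for b in betas:
        prod *= 1.0 - b
        alpha_bar.append(prod)

    # Start from pure Gaussian noise.
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    # Evenly spaced subset of timesteps, visited in descending order.
    ts = [round((T - 1) * i / (num_steps - 1)) for i in range(num_steps)][::-1]
    for i, t in enumerate(ts):
        a_t = alpha_bar[t]
        a_prev = alpha_bar[ts[i + 1]] if i + 1 < len(ts) else 1.0
        eps = eps_model(x, t)
        # Predict the clean sample x0, then step toward the earlier timestep.
        x0 = [(xi - math.sqrt(1 - a_t) * ei) / math.sqrt(a_t)
              for xi, ei in zip(x, eps)]
        x = [math.sqrt(a_prev) * x0i + math.sqrt(1 - a_prev) * ei
             for x0i, ei in zip(x0, eps)]
    return x

# Toy stand-in "denoiser": pretends the noise is a fixed fraction of x.
# A real model would be a text-conditioned UNet; this is illustration only.
def toy_eps(x, t):
    return [0.1 * xi for xi in x]

sample = ddim_sample(toy_eps, dim=4, num_steps=8)
```

The point of the sketch is only the outer loop: distillation lets the model tolerate large jumps between the few selected timesteps, which is where the speedup over 50-step sampling comes from.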

Community

Ikun

A small bridge, flowing water, houses

Printed scarves

How can I make a photo

Can't wait for this model or its code to never, ever get released 🤗

Models citing this paper 0

No model linking this paper

Datasets citing this paper 0

No dataset linking this paper

Spaces citing this paper 0

No Space linking this paper

Collections including this paper 8