arxiv:2306.00980

SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds

Published on Jun 1, 2023 · Featured in Daily Papers on Jun 2, 2023

Abstract

Text-to-image diffusion models can create stunning images from natural language descriptions that rival the work of professional artists and photographers. However, these models are large, with complex network architectures and tens of denoising iterations, making them computationally expensive and slow to run. As a result, high-end GPUs and cloud-based inference are required to run diffusion models at scale. This is costly and has privacy implications, especially when user data is sent to a third party. To overcome these challenges, we present a generic approach that, for the first time, unlocks running text-to-image diffusion models on mobile devices in less than 2 seconds. We achieve this by introducing an efficient network architecture and improving step distillation. Specifically, we propose an efficient UNet by identifying the redundancy of the original model and reducing the computation of the image decoder via data distillation. Further, we enhance the step distillation by exploring training strategies and introducing regularization from classifier-free guidance. Our extensive experiments on MS-COCO show that our model with 8 denoising steps achieves better FID and CLIP scores than Stable Diffusion v1.5 with 50 steps. Our work democratizes content creation by bringing powerful text-to-image diffusion models into the hands of users.
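The abstract mentions classifier-free guidance without spelling it out. As a rough illustration, the sketch below shows the standard way a guided noise prediction is formed from an unconditional and a text-conditioned prediction. It is a minimal sketch of generic classifier-free guidance, not the paper's distillation regularizer; the function name `cfg_combine` and the use of plain Python lists in place of tensors are assumptions made for illustration.

```python
from typing import List, Sequence


def cfg_combine(
    eps_uncond: Sequence[float],
    eps_cond: Sequence[float],
    guidance_scale: float,
) -> List[float]:
    """Combine two noise predictions with classifier-free guidance.

    Hypothetical helper: eps_uncond is the prediction for an empty prompt,
    eps_cond is the prediction for the text prompt, and guidance_scale is
    the guidance weight w in  eps = eps_uncond + w * (eps_cond - eps_uncond).
    """
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("predictions must have the same length")
    # Move from the unconditional prediction toward (and past, when w > 1)
    # the conditional one.
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


if __name__ == "__main__":
    uncond = [0.0, 1.0]
    cond = [1.0, 3.0]
    for w in (0.0, 1.0, 2.0):
        print(w, cfg_combine(uncond, cond, w))
```

With a scale of 0 the result equals the unconditional prediction, with 1 it equals the conditional prediction, and larger values extrapolate further toward the prompt. Each guided step normally needs both predictions, which is part of why reducing the number of denoising steps matters for on-device inference.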

Community

create new vehicle

generate new image with supercar on cyberpunk city street

Create flying taxi

generate new image with supercar on cyberpunk city street realistic

Design a logo for the 11th Staff Sports Games, with an Olympic torch and a running figure as elements

lol you guys this isnt where you put prompts

a dog on house

Fantastic building in Guangzhou city
