This demo showcases a lightweight Stable Diffusion model (SDM) for general-purpose text-to-image synthesis. Our model [**BK-SDM-Small**](https://huggingface.co/nota-ai/bk-sdm-small) reduces parameters and latency by **36%**. It is built by (i) removing several residual and attention blocks from the U-Net of [SDM-v1.4](https://huggingface.co/CompVis/stable-diffusion-v1-4) and (ii) distillation pretraining on only 0.22M LAION pairs (fewer than 0.1% of the full training set). Despite these very limited training resources, our model imitates the original SDM well by benefiting from transferred knowledge.
- **For more information & acknowledgments**, please see [Paper](https://arxiv.org/abs/2305.15798), [GitHub](https://github.com/Nota-NetsPresso/BK-SDM), BK-SDM-{[Base](https://huggingface.co/nota-ai/bk-sdm-base), [Small](https://huggingface.co/nota-ai/bk-sdm-small), [Tiny](https://huggingface.co/nota-ai/bk-sdm-tiny)} Model Card.
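A hedged usage sketch with Hugging Face `diffusers` (a standard way to load Stable Diffusion checkpoints; the prompt and file name below are illustrative, not from this demo):

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the compressed model; it is a drop-in replacement for SDM-v1.4 pipelines.
pipe = StableDiffusionPipeline.from_pretrained(
    "nota-ai/bk-sdm-small", torch_dtype=torch.float16
).to("cuda")

# Fix the seed for reproducibility; 25 denoising steps, as in this demo.
generator = torch.Generator("cuda").manual_seed(42)
image = pipe(
    "a photo of an astronaut riding a horse",
    num_inference_steps=25,
    generator=generator,
).images[0]
image.save("output.png")
```

On a GPU this runs in seconds; on the free CPU tier below it takes minutes, which is why the hosted inference API is recommended.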
  
<center>
    <img alt="U-Net architectures and KD-based pretraining" src="https://huggingface.co/spaces/nota-ai/compressed-stable-diffusion/resolve/91f349ab3b900cbfec20163edd6a312d1e8c8193/docs/fig_model.png" width="65%">
</center>

<br/>

- This research was accepted to [**ICCV 2023 Demo Track**](https://iccv2023.thecvf.com/demos-111.php) & [**ICML 2023 Workshop on Efficient Systems for Foundation Models** (ES-FoMo)](https://es-fomo.com/).
- Please be aware that your prompts are logged, _without_ any personally identifiable information.
- To obtain different images for the same prompt, change the _Random Seed_ in Advanced Settings (the demo reuses the first latent code sampled for each seed).
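A minimal sketch of why changing the seed matters, assuming PyTorch: the initial latent noise is derived deterministically from the seed, so the same seed reproduces the same starting latent (and hence the same image), while a new seed yields a new one.

```python
import torch  # assumption: PyTorch, as used by Stable Diffusion pipelines


def sample_latent(seed: int) -> torch.Tensor:
    """Sample the initial latent noise for a given seed."""
    g = torch.Generator().manual_seed(seed)
    # 1x4x64x64 is the U-Net latent shape for a 512x512 image in SDM-v1.4.
    return torch.randn(1, 4, 64, 64, generator=g)


same_a, same_b = sample_latent(42), sample_latent(42)
different = sample_latent(43)
print(torch.equal(same_a, same_b))     # True: identical seeds give identical latents
print(torch.equal(same_a, different))  # False: a new seed gives a new latent
```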

<br/>
**Demo Environment**: [Oct/08/2023] Free CPU-basic (2 vCPU · 16 GB RAM) — slow inference, 7~10 min for a 512×512 image with 25 denoising steps
<br/>

- 🙏 For faster results, consider the [Hosted Inference API in our Model Card](https://huggingface.co/nota-ai/bk-sdm-small)

<details>
<summary>Previous Demo Environments</summary>
[Oct/01/2023] NVIDIA T4-small (4 vCPU · 15 GB RAM · 16GB VRAM) — 5~10 sec inference.
<br/>
[Sept/01/2023] Free CPU-basic (2 vCPU · 16 GB RAM) — 7~10 min slow inference.
<br/>
[Aug/01/2023] NVIDIA T4-small (4 vCPU · 15 GB RAM · 16GB VRAM) — 5~10 sec inference.
<br/>
[July/31/2023] Free CPU-basic (2 vCPU · 16 GB RAM) — 7~10 min slow inference.
<br/>
[July/27/2023] NVIDIA T4-small (4 vCPU · 15 GB RAM · 16GB VRAM) — 5~10 sec inference.
<br/>
[June/30/2023] Free CPU-basic (2 vCPU · 16 GB RAM) — 7~10 min slow inference.
<br/>
[May/31/2023] NVIDIA T4-small (4 vCPU · 15 GB RAM · 16GB VRAM) — 5~10 sec inference.
</details>