ywlee88 committed

Commit ad5f604 • 1 Parent(s): f9e966d

update README

Files changed (1): README.md +16 -3
README.md CHANGED
@@ -8,9 +8,6 @@ tags:
  <img src="https://dl.dropboxusercontent.com/scl/fi/yosvi68jvyarbvymxc4hm/github_logo.png?rlkey=r9ouwcd7cqxjbvio43q9b3djd&dl=1" width="1024px" />
  </div>

- > **[KOALA: Self-Attention Matters in Knowledge Distillation of Latent Diffusion Models for Memory-Efficient and Fast Image Synthesis](http://arxiv.org/abs/2312.04005)**<br>
- > [Youngwan Lee](https://github.com/youngwanLEE)<sup>1,2</sup>, [Kwanyong Park](https://pkyong95.github.io/)<sup>1</sup>, [Yoorhim Cho](https://ofzlo.github.io/)<sup>3</sup>, [Young-Ju Lee](https://scholar.google.com/citations?user=6goOQh8AAAAJ&hl=en)<sup>1</sup>, [Sung Ju Hwang](http://www.sungjuhwang.com/)<sup>2,4</sup> <br>
- > <sup>1</sup>ETRI <sup>2</sup>KAIST, <sup>3</sup>SMWU, <sup>4</sup>DeepAuto.ai


  <div style="display:flex;justify-content: center">
@@ -57,6 +54,20 @@ There are two types of compressed U-Net, KOALA-1B and KOALA-700M, which are
  <img src="https://dl.dropboxusercontent.com/scl/fi/5ydeywgiyt1d3njw63dpk/arch.png?rlkey=1p6imbjs4lkmfpcxy153i1a2t&dl=1" width="1024px" />
  </div>

+ ### U-Net comparison
+
+ | U-Net | SDM-v2.0 | SDXL-Base-1.0 | KOALA-1B | KOALA-700M |
+ |-------|----------|-----------|-----------|-------------|
+ | Param. | 865M | 2,567M | 1,161M | 782M |
+ | CKPT size | 3.46GB | 10.3GB | 4.4GB | 3.0GB |
+ | Tx blocks | [1, 1, 1, 1] | [0, 2, 10] | [0, 2, 6] | [0, 2, 5] |
+ | Mid block | ✓ | ✓ | ✓ | ✗ |
+ | Latency | 1.131s | 3.133s | 1.604s | 1.257s |
+
+ - Tx means transformer block and CKPT means the trained checkpoint file.
+ - We measured latency with FP16 precision and 25 denoising steps on an NVIDIA 4090 GPU (24GB).
+ - SDM-v2.0 uses 768x768 resolution, while the SDXL and KOALA models use 1024x1024 resolution.
+

  ## Latency and memory usage comparison on different GPUs

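The parameter and latency figures in the added table can be sanity-checked with a short benchmark along the following lines. This sketch is not part of the commit; the Hub repo id `etri-vilab/koala-700m` and the use of `StableDiffusionXLPipeline` are assumptions.

```python
# Minimal benchmarking sketch (not from this commit); the repo id is an assumption.
import time

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "etri-vilab/koala-700m", torch_dtype=torch.float16
).to("cuda")

# Parameter count of the compressed U-Net (compare with the "Param." row).
n_params = sum(p.numel() for p in pipe.unet.parameters())
print(f"U-Net parameters: {n_params / 1e6:.0f}M")

prompt = "A portrait of a koala wearing a lab coat, highly detailed"

# Warm-up run so one-time CUDA initialization does not skew the timing.
pipe(prompt, num_inference_steps=25)

torch.cuda.synchronize()
start = time.perf_counter()
pipe(prompt, num_inference_steps=25)
torch.cuda.synchronize()
print(f"Latency (FP16, 25 steps): {time.perf_counter() - start:.3f}s")
```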
 
@@ -85,6 +96,8 @@ We measure the inference time of SDM-v2.0 with 768x768 resolution and the other
  - Resources for more information: Check out [KOALA report on arXiv](https://arxiv.org/abs/2312.04005) and [project page](https://youngwanlee.github.io/KOALA/).


+
+
  ## Usage with 🤗[Diffusers library](https://github.com/huggingface/diffusers)
  The inference code with 25 denoising steps
  ```python
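# The hunk's own snippet is truncated at the opening fence above; the code below
# is a hedged sketch of SDXL-style Diffusers inference with 25 denoising steps,
# not the commit's code. The Hub repo id "etri-vilab/koala-700m" is an assumption.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "etri-vilab/koala-700m", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

prompt = "A portrait painting of a cute koala, highly detailed"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("koala_sample.png")
```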
 