cq committed on
Commit
1eb13a0
1 Parent(s): 63dbe71

diffusion-deploy

Files changed (1): README.md (+5 -3)
README.md CHANGED
@@ -14,8 +14,10 @@ pipeline_tag: text-to-image
 ---
 
 # Small Stable Diffusion Model Card
+ 【Update 2023/02/07】 We recently released [a diffusion deployment repo](https://github.com/OFA-Sys/diffusion-deploy) to speed up inference on both GPU (\~4x speedup, based on TensorRT) and CPU (\~12x speedup, based on Intel OpenVINO).
+ Integrated with that repo, small-stable-diffusion can generate images in just **5 seconds on the CPU**.
 
- Similar image generation quality, but nearly 1/2 smaller!
+ Similar image generation quality, but the model is nearly 50% smaller!
 Here are some samples:
 ![Samples](https://huggingface.co/OFA-Sys/small-stable-diffusion-v0/resolve/main/sample_images_compressed.jpg)
 
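For reference alongside the update note above, the sketch below shows the plain diffusers path for this model, without the TensorRT/OpenVINO acceleration from diffusion-deploy. The prompt, step count, and device handling are illustrative assumptions, not part of this commit:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the small model from the Hugging Face Hub.
pipe = StableDiffusionPipeline.from_pretrained("OFA-Sys/small-stable-diffusion-v0")

# Run on GPU if available; plain CPU inference also works, but is slower
# than the OpenVINO-accelerated path described in the update note.
pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")

image = pipe("an apple, 4k", num_inference_steps=25).images[0]
image.save("apple.png")
```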
@@ -52,13 +54,13 @@ This model is initialized from stable-diffusion v1-4. As the model structure is
 
 ### Training Procedure
 
- After the initialization, the model have been trained for 1100k steps in 8xA100 GPUS. The training progress consists of three stages. The first stage is a simple pre-training precedure. In the last two stages, the original stable diffusion was utilized to distill knowledge to small model as a teacher model. In all stages, only the parameters in unet were trained and other parameters were frozen.
+ After the initialization, the model has been trained for 1100k steps on 8 x A100 GPUs. The training process consists of three stages. The first stage is a simple pre-training procedure. In the last two stages, the original stable diffusion was used as a teacher model to distill knowledge into the small model. In all stages, only the parameters of the unet were trained; all other parameters were frozen.
 
 
 - **Hardware:** 8 x A100-80GB GPUs
 - **Optimizer:** AdamW
 
- - **Stage 1** - Pretrain the unet part of model.
+ - **Stage 1** - Pretrain the unet part of the model.
   - **Steps**: 500,000
   - **Batch:** batch size=8, GPUs=8, Gradient Accumulations=2. Total batch size=128
   - **Learning rate:** warmup to 1e-5 for 10,000 steps and then kept constant
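The training setup described in this hunk (AdamW, frozen v1-4 teacher, warmup to 1e-5 over 10,000 steps then constant) could be wired up roughly as sketched below. This is a hedged sketch only: the actual training code is not part of this commit, the MSE-on-noise-predictions objective is an assumption, and gradient accumulation (x2, for the total batch size of 128) is omitted for brevity.

```python
import torch
import torch.nn.functional as F
from diffusers import UNet2DConditionModel
from transformers import get_constant_schedule_with_warmup

# Student: the small unet, the only component that is trained per the card.
student = UNet2DConditionModel.from_pretrained(
    "OFA-Sys/small-stable-diffusion-v0", subfolder="unet"
)
# Teacher: the frozen unet from the original stable-diffusion v1-4.
teacher = UNet2DConditionModel.from_pretrained(
    "CompVis/stable-diffusion-v1-4", subfolder="unet"
).requires_grad_(False)

optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)
# Warm up to 1e-5 over 10,000 steps, then hold the learning rate constant.
scheduler = get_constant_schedule_with_warmup(optimizer, num_warmup_steps=10_000)

def distill_step(noisy_latents, timesteps, text_embeddings):
    """One distillation step (stages 2-3). The card does not publish the
    loss; MSE between teacher and student noise predictions is an assumption."""
    with torch.no_grad():
        target = teacher(noisy_latents, timesteps, text_embeddings).sample
    pred = student(noisy_latents, timesteps, text_embeddings).sample
    loss = F.mse_loss(pred, target)
    loss.backward()
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
    return loss.item()
```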