Iceclear committed on
Commit
78e1da3
1 Parent(s): ec26602

Rename autoencoder

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -48,17 +48,17 @@ The model developer used the following dataset for training the model:
  **Training Procedure**
  StableSR is an image super-resolution model finetuned on [Stable Diffusion](https://github.com/Stability-AI/stablediffusion), further equipped with a time-aware encoder and a controllable feature wrapping (CFW) module.
 
- - Following Stable Diffusion, images are encoded through the fixed VQGAN encoder, which turns images into latent representations. The autoencoder uses a relative downsampling factor of 8 and maps images of shape H x W x 3 to latents of shape H/f x W/f x 4.
+ - Following Stable Diffusion, images are encoded through the fixed autoencoder, which turns images into latent representations. The autoencoder uses a relative downsampling factor of 8 and maps images of shape H x W x 3 to latents of shape H/f x W/f x 4.
  - The latent representations are fed to the time-aware encoder as guidance.
  - The loss is the same as Stable Diffusion.
  - After finetuning the diffusion model, we further train the CFW module using the data generated by the finetuned diffusion model.
- - The VQGAN model is fixed and only CFW is trainable.
+ - The autoencoder model is fixed and only CFW is trainable.
- - The loss is similar to training a VQGAN, except that we use a fixed adversarial loss weight of 0.025 rather than a self-adjustable one.
+ - The loss is similar to training an autoencoder, except that we use a fixed adversarial loss weight of 0.025 rather than a self-adjustable one.
 
  We currently provide the following checkpoints:
 
  - [stablesr_000117.ckpt](https://huggingface.co/Iceclear/StableSR/resolve/main/stablesr_000117.ckpt): Diffusion model finetuned on [SD2.1-512base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base) with DF2K_OST dataset for 117 epochs.
- - [vqgan_cfw_00011.ckpt](https://huggingface.co/Iceclear/StableSR/resolve/main/vqgan_cfw_00011.ckpt): CFW module with fixed VQGAN trained on synthetic paired data for 11 epochs.
+ - [vqgan_cfw_00011.ckpt](https://huggingface.co/Iceclear/StableSR/resolve/main/vqgan_cfw_00011.ckpt): CFW module with fixed autoencoder trained on synthetic paired data for 11 epochs.
  - [stablesr_768v_000139.ckpt](https://huggingface.co/Iceclear/StableSR/blob/main/stablesr_768v_000139.ckpt): Diffusion model finetuned on [SD2.1-768v](https://huggingface.co/stabilityai/stable-diffusion-2-1) with DF2K_OST dataset for 139 epochs.
 
  ## Evaluation Results
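
As a quick sanity check on the shape mapping described in the diff above (downsampling factor f = 8, 4 latent channels), here is a minimal sketch; the `latent_shape` helper is illustrative only, not part of the StableSR codebase:

```python
def latent_shape(h, w, f=8):
    """Map an H x W x 3 image to its H/f x W/f x 4 latent shape,
    as produced by the fixed Stable Diffusion autoencoder (f = 8)."""
    assert h % f == 0 and w % f == 0, "spatial dims must be divisible by f"
    return (h // f, w // f, 4)

# e.g. a 512 x 512 RGB input yields a 64 x 64 x 4 latent
print(latent_shape(512, 512))  # (64, 64, 4)
```

This matches the two checkpoint resolutions above: a 512 x 512 input maps to a 64 x 64 x 4 latent, and a 768 x 768 input to 96 x 96 x 4.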