Commit 01c4a98 by teticio
1 parent: 529c646

update readme

Files changed (3)
  1. README.md +8 -0
  2. notebooks/test_model.ipynb +0 -0
  3. scripts/train_vae.py +0 -3
README.md CHANGED
@@ -127,6 +127,13 @@ Rather than denoising images directly, it is interesting to work in the "latent
 
 At the time of writing, the Hugging Face `diffusers` library is geared towards inference and lacking in training functionality, rather like its cousin `transformers` in the early days of development. In order to train a VAE (Variational Autoencoder), I use the [stable-diffusion](https://github.com/CompVis/stable-diffusion) repo from CompVis and convert the checkpoints to `diffusers` format. Note that it uses a perceptual loss function for images; it would be nice to try a perceptual *audio* loss function.
 
+#### Install dependencies to train with Stable Diffusion
+```
+pip install omegaconf
+pip install -e git+https://github.com/CompVis/stable-diffusion.git@main#egg=latent-diffusion
+pip install -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers
+```
+
 #### Train an autoencoder.
 ```bash
 python scripts/train_vae.py \
@@ -138,6 +145,7 @@ python scripts/train_vae.py \
 #### Train latent diffusion model.
 ```bash
 accelerate launch ...
+...
 --vae models/autoencoder-kl
 --latent_resoultion 32
 ```
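
Note: the install steps added to the README can be checked before launching `scripts/train_vae.py`. The snippet below is a minimal sketch and not part of this commit. It assumes the three packages expose the top-level modules `omegaconf`, `ldm` (from the `latent-diffusion` egg) and `taming` (from `taming-transformers`); the helper name `missing_modules` is hypothetical.

```python
import importlib.util

# Assumed top-level module names for the packages installed in the README:
# omegaconf -> omegaconf, latent-diffusion -> ldm, taming-transformers -> taming.
REQUIRED_MODULES = ("omegaconf", "ldm", "taming")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names in `names` that cannot be located on the import path."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised modules.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All VAE training dependencies are importable.")
```

This only confirms that the modules can be found; it does not verify versions or that the editable installs point at the intended branches.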
notebooks/test_model.ipynb CHANGED
The diff for this file is too large to render.
 
scripts/train_vae.py CHANGED
@@ -1,6 +1,3 @@
-# pip install -e git+https://github.com/CompVis/stable-diffusion.git@master
-# pip install -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers
-
 import os
 import argparse
 