Update README.md
Browse files
README.md
CHANGED
@@ -27,11 +27,21 @@ model_student.ckpt: The latent diffusion model checkpoint
|
|
27 |
|
28 |
#### How to use
|
29 |
|
30 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
31 |
|
32 |
#### Limitations and bias
|
33 |
|
34 |
-
|
|
|
|
|
35 |
|
36 |
## Training data
|
37 |
|
|
|
27 |
|
28 |
#### How to use
|
29 |
|
30 |
+
You need some dependancies from multiple repositories linked in this repository : [CLOOB latent diffusion](https://github.com/JD-P/cloob-latent-diffusion) :
|
31 |
+
|
32 |
+
* [CLIP](https://github.com/openai/CLIP/tree/40f5484c1c74edd83cb9cf687c6ab92b28d8b656)
|
33 |
+
* [CLOOB](https://github.com/crowsonkb/cloob-training/tree/136ca7dd69a03eeb6ad525da991d5d7083e44055) : the model to encode images and texts in an unified latent space, used for conditionning the latent diffusion.
|
34 |
+
* [Latent Diffusion](https://github.com/CompVis/latent-diffusion/tree/f13bf9bf463d95b5a16aeadd2b02abde31f769f8) : latent diffusion model definition
|
35 |
+
* [Taming transformers](https://github.com/CompVis/taming-transformers/tree/24268930bf1dce879235a7fddd0b2355b84d7ea6) : a pretrained convolutional VQGAN is used as an autoencoder to go from image space to the latent space in which the diffusion is done.
|
36 |
+
* [v-diffusion](https://github.com/crowsonkb/v-diffusion-pytorch/tree/ffabbb1a897541fa2a3d034f397c224489d97b39) : contains some functions for sampling using a diffusion model with text and/or image prompts.
|
37 |
+
|
38 |
+
An example code to use the model to sample images from a text prompt can be seen in a [Colab Notebook](https://colab.research.google.com/drive/1XGHdO8IAGajnpb-x4aOb-OMYfZf0WDTi?usp=sharing), or directly in the [app source code](https://huggingface.co/spaces/huggan/wikiart-diffusion-mini/blob/main/app.py) for the Gradio demo on [this Space](https://huggingface.co/spaces/huggan/wikiart-diffusion-mini)
|
39 |
|
40 |
#### Limitations and bias
|
41 |
|
42 |
+
The student latent diffusion model was trained only on images from the WikiArt dataset, but the VQGAN autoencoder, the CLOOB model and the teacher latent diffusion model all come from pretrained checkpoints and were trained on images and texts from the internet.
|
43 |
+
|
44 |
+
According to the [Latent Diffusion paper](https://arxiv.org/abs/2112.10752): “Deep learning modules tend to reproduce or exacerbate biases that are already present in the data”.
|
45 |
|
46 |
## Training data
|
47 |
|