rromb committed on
Commit
26c72b7
1 Parent(s): 3deeafc

Update README.md

  1. README.md +8 -6
README.md CHANGED
@@ -8,23 +8,25 @@ license: creativeml-openrail-m
 
  ![pipeline](pipeline.png)
 
- SDXL consists of a mixture-of-experts pipeline for latent diffusion:
+ [SDXL](https://arxiv.org/abs/2307.01952) consists of a mixture-of-experts pipeline for latent diffusion:
  In a first step, the base model is used to generate (noisy) latents,
- which are then further processed with a refinement model (available here: TODO) specialized for the final denoising steps.
+ which are then further processed with a refinement model (available here: https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/) specialized for the final denoising steps.
  Note that the base model can be used as a standalone module.
 
- Alternatively, we can use a two-step pipeline as follows:
+ Alternatively, we can use a two-stage pipeline as follows:
  First, the base model is used to generate latents of the desired output size.
  In the second step, we use a specialized high-resolution model and apply a technique called SDEdit (https://arxiv.org/abs/2108.01073, also known as "img2img")
- to the latents generated in the first step, using the same prompt. Note that this technique is slightly slower than the first one, as it requires more function evaluations.
+ to the latents generated in the first step, using the same prompt. This technique is slightly slower than the first one, as it requires more function evaluations.
+
+ Source code is available at https://github.com/Stability-AI/generative-models .
 
  ### Model Description
 
  - **Developed by:** Stability AI
  - **Model type:** Diffusion-based text-to-image generative model
- - **License:** [OpenRAIL-M CreativeML](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md)
+ - **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md)
  - **Model Description:** This is a model that can be used to generate and modify images based on text prompts. It is a [Latent Diffusion Model](https://arxiv.org/abs/2112.10752) that uses two fixed, pretrained text encoders ([OpenCLIP-ViT/G](https://github.com/mlfoundations/open_clip) and [CLIP-ViT/L](https://github.com/openai/CLIP/tree/main)).
- - **Resources for more information:** [GitHub Repository](https://github.com/Stability-AI/generative-models) [SDXL paper on arXiv](https://arxiv.org/abs/2307.01952).
+ - **Resources for more information:** Check out our [GitHub Repository](https://github.com/Stability-AI/generative-models) and the [SDXL report on arXiv](https://arxiv.org/abs/2307.01952).
 
  ### Model Sources
 
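
The README text in this diff says the second (SDEdit) pipeline needs more function evaluations than the first. A minimal sketch of that arithmetic follows. It is not taken from the repository: the function names, the step count of 40, the handoff fraction of 0.8, and the SDEdit strength of 0.25 are all illustrative assumptions, and each denoising step is counted as one function evaluation (any doubling from classifier-free guidance is ignored).

```python
def split_steps(num_steps, handoff):
    """Pipeline 1 (assumed model): base and refiner share ONE schedule.

    The base model runs the first `handoff` fraction of the steps and
    the refiner runs the remaining steps, so the total stays `num_steps`.
    """
    if not 0.0 <= handoff <= 1.0:
        raise ValueError("handoff must be in [0, 1]")
    base_steps = round(num_steps * handoff)
    return base_steps, num_steps - base_steps


def sdedit_steps(num_steps, strength):
    """Pipeline 2 (assumed model): base runs the FULL schedule.

    The second model then re-noises the result to level `strength`
    and denoises from there, which adds about `strength * num_steps`
    extra steps on top of the base model's full run.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return num_steps, round(num_steps * strength)


# Hypothetical settings, chosen only for illustration.
shared = split_steps(40, handoff=0.8)
two_stage = sdedit_steps(40, strength=0.25)

total_shared = sum(shared)
total_two_stage = sum(two_stage)
print(shared, total_shared)
print(two_stage, total_two_stage)
```

Under these assumptions the shared-schedule pipeline never exceeds the chosen step budget, while the SDEdit pipeline always pays for the full base run plus the extra refinement steps.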