madebyollin committed on
Commit
cf4db73
1 Parent(s): 81a553d

Update README.md

Files changed (1)
  1. README.md +21 -19
README.md CHANGED
@@ -16,27 +16,42 @@ library_name: diffusers
 | ![](example_baseline.png) | ![](example_finetuned.png) |
 | ![](example_baseline_closeup.png) | ![](example_finetuned_closeup.png) |
 
+## Explanation
+
+Image generators like Würstchen and Stable Cascade create images via a multi-stage process.
+Stage A is the ultimate stage, responsible for rendering out full-resolution, human-interpretable images (based on the output from prior stages).
+
+The original Stage A tends to render slightly-smoothed-out images with a distinctive noise pattern on top.
+
+`stage-a-ft-hq` was finetuned briefly on a high-quality dataset in order to reduce these artifacts.
+
+## Suggested Settings
+
+To generate highly detailed images, you probably want to use `stage-a-ft-hq` (which improves very fine detail) in combination with a large Stage B step count (which [improves mid-level detail](https://old.reddit.com/r/StableDiffusion/comments/1ar359h/cascade_can_generate_directly_at_1536x1536_and/kqhjtk5/)).
 
 ## 🧨 Diffusers Usage
 
 ⚠️ As of 2024-02-17, Stable Cascade's [PR](https://github.com/huggingface/diffusers/pull/6487) is still under review.
-I've only confirmed Stable Cascade working with this particular version of the PR:
+I've only tested Stable Cascade with this particular version of the PR:
+
 ```bash
 pip install --upgrade --force-reinstall https://github.com/kashif/diffusers/archive/a3dc21385b7386beb3dab3a9845962ede6765887.zip
 ```
 
+TODO: verify this particular sample code works
+
 ```py
 import torch
+device = "cuda"
 
 # Load the Stage-A-ft-HQ model
 from diffusers.pipelines.wuerstchen import PaellaVQModel
-stage_a_ft_hq = PaellaVQModel.from_pretrained("madebyollin/stage_a_ft_hq", torch_dtype=torch.float16)
+stage_a_ft_hq = PaellaVQModel.from_pretrained("madebyollin/stage-a-ft-hq", torch_dtype=torch.float16).to(device)
 
 # Load the normal Stable Cascade pipeline
 from diffusers import StableCascadeDecoderPipeline, StableCascadePriorPipeline
 
-device = "cuda"
-num_images_per_prompt = 2
+num_images_per_prompt = 1
 
 prior = StableCascadePriorPipeline.from_pretrained("stabilityai/stable-cascade-prior", torch_dtype=torch.bfloat16).to(device)
 decoder = StableCascadeDecoderPipeline.from_pretrained("stabilityai/stable-cascade", torch_dtype=torch.float16).to(device)
@@ -62,21 +77,8 @@ decoder_output = decoder(
 negative_prompt=negative_prompt,
 guidance_scale=0.0,
 output_type="pil",
-num_inference_steps=10
+num_inference_steps=20
 ).images
 
 display(decoder_output[0])
-```
-
-## Explanation
-
-Image generators like Würstchen and Stable Cascade create images via a multi-stage process.
-Stage A is the ultimate stage, responsible for rendering out full-resolution, human-interpretable images (based on the output from prior stages).
-
-The original Stage A tends to render slightly-smoothed-out images with a distinctive noise pattern on top.
-
-`stage-a-ft-hq` was finetuned briefly on a high-quality dataset in order to reduce these artifacts.
-
-## Suggested Settings
-
-To generate highly detailed images, you probably want to use `stage-a-ft-hq` (which improves very fine detail) in combination with a large Stage B step count (which [improves mid-level detail](https://old.reddit.com/r/StableDiffusion/comments/1ar359h/cascade_can_generate_directly_at_1536x1536_and/kqhjtk5/)).
+```
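
This commit raises the decoder's `num_inference_steps` from 10 to 20, which appears to match the "large Stage B step count" advice in Suggested Settings. A minimal sketch of keeping those settings in one place follows. It assumes the decoder's `num_inference_steps` is the Stage B step count. `stage_b_settings` is a hypothetical helper name, not a diffusers API, and it only covers the keyword arguments visible in the diff; the remaining decoder arguments are elided between hunks and would still have to be passed separately.

```python
# Hypothetical helper (not part of diffusers or the original README).
# It gathers the decoder keyword arguments that are visible in the diff.

def stage_b_settings(num_inference_steps: int = 20) -> dict:
    """Return decoder keyword arguments; the default step count is the commit's new value."""
    if num_inference_steps < 1:
        raise ValueError("num_inference_steps must be at least 1")
    return {
        "guidance_scale": 0.0,        # value shown in the diff
        "output_type": "pil",         # value shown in the diff
        "num_inference_steps": num_inference_steps,
    }


before = stage_b_settings(10)  # step count before this commit
after = stage_b_settings()     # step count after this commit

# Assumed usage: pass the dict alongside the other decoder arguments,
# e.g. decoder(..., **after), where "..." stands for arguments not shown in the diff.
print(before["num_inference_steps"], after["num_inference_steps"])
```

Raising the value passed to `stage_b_settings` is the knob to turn when mid-level detail matters more than generation time.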