Birchlabs committed
Commit 95328ee · 1 parent: 33eca3b

Update README.md

Files changed (1): README.md (+30 −5)
README.md CHANGED
@@ -27,8 +27,11 @@ Diffusers' StableDiffusionXLPipeline convention handles text encoders + UNet + V
 
 ```python
 from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler
+from diffusers.pipelines.stable_diffusion_xl import StableDiffusionXLPipelineOutput
 import torch
 from torch import Generator
+from PIL import Image
+from typing import List
 
 # scheduler args documented here:
 # https://github.com/huggingface/diffusers/blob/main/src/diffusers/schedulers/scheduling_dpmsolver_multistep.py#L98
@@ -43,6 +46,8 @@ scheduler: DPMSolverMultistepScheduler = DPMSolverMultistepScheduler.from_pretra
   use_karras_sigmas=True,
 )
 
+# pipeline args documented here:
+# https://github.com/huggingface/diffusers/blob/95b7de88fd0dffef2533f1cbaf9ffd9d3c6d04c8/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py#L548
 pipe: StableDiffusionXLPipeline = StableDiffusionXLPipeline.from_pretrained(
   'Birchlabs/waifu-diffusion-xl-unofficial',
   scheduler=scheduler,
@@ -50,12 +55,15 @@ pipe: StableDiffusionXLPipeline = StableDiffusionXLPipeline.from_pretrained(
   use_safetensors=True,
   variant='fp16'
 )
-pipe.to("cuda")
+pipe.to('cuda')
+
+# StableDiffusionXLPipeline is hardcoded to cast the VAE to float32, but Ollin's VAE works fine in float16
+pipe.vae.to(torch.float16)
 
 prompt = 'masterpiece, best quality, 1girl, green hair, sweater, looking at viewer, upper body, beanie, outdoors, watercolor, night, turtleneck'
 negative_prompt = 'lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name'
 
-images = pipe(
+out: StableDiffusionXLPipelineOutput = pipe(
   prompt=prompt,
   negative_prompt=negative_prompt,
   num_inference_steps=25,
@@ -64,9 +72,13 @@ images = pipe(
   target_size=(1024, 1024),
   height=1024,
   width=1024,
-  generator=Generator().manual_seed(45),
-).images[0]
+  generator=Generator().manual_seed(48),
+)
 
+images: List[Image.Image] = out.images
+img, *_ = images
+
+img.save('waifu.png')
 ```
 
 ### UNet2DConditionModel
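
The hunk above adds `pipe.vae.to(torch.float16)`, relying on an fp16-safe VAE. Assuming "Ollin's VAE" refers to madebyollin's [`sdxl-vae-fp16-fix`](https://huggingface.co/madebyollin/sdxl-vae-fp16-fix) (an assumption on my part; the diff doesn't name the repository), a sketch of the equivalent explicit approach is to load that VAE and pass it to the pipeline:

```python
# Sketch: load an fp16-safe SDXL VAE explicitly instead of casting in-place.
# Assumes "Ollin's VAE" = madebyollin/sdxl-vae-fp16-fix (not confirmed by the diff).
import torch
from diffusers import AutoencoderKL, StableDiffusionXLPipeline

vae = AutoencoderKL.from_pretrained(
  'madebyollin/sdxl-vae-fp16-fix',
  torch_dtype=torch.float16,
)
pipe = StableDiffusionXLPipeline.from_pretrained(
  'Birchlabs/waifu-diffusion-xl-unofficial',
  vae=vae,  # overrides the repository's bundled VAE
  torch_dtype=torch.float16,
  use_safetensors=True,
  variant='fp16',
)
pipe.to('cuda')
```

Either way the VAE stays in float16; the fp16-fix VAE exists precisely because the stock SDXL VAE tends to overflow in half precision.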
@@ -88,7 +100,20 @@ base_unet: UNet2DConditionModel = UNet2DConditionModel.from_pretrained(
 
 ## How it was converted
 
-I used Kohya's converter script. See [this commit](https://github.com/Birch-san/diffusers-play/commit/3f16355dd0064932d0bf356ed78676089b9e46ca), and my [previous explanation](https://huggingface.co/Birchlabs/wd-1-5-beta3-unofficial#how-wd15b3-compvis-checkpoint-was-converted) for a bit more detail on how I invoke such scripts.
+I used Kohya's converter script to convert the official (`hakurei/waifu-diffusion-xl`) [`wdxl-aesthetic-0.9.safetensors`](https://huggingface.co/hakurei/waifu-diffusion-xl/blob/main/wdxl-aesthetic-0.9.safetensors). See [this commit](https://github.com/Birch-san/diffusers-play/commit/3f16355dd0064932d0bf356ed78676089b9e46ca).
+
+I forked [kohya's converter script](https://github.com/bmaltais/kohya_ss/blob/master/tools/convert_diffusers20_original_sd.py), making one [for SDXL](https://github.com/Birch-san/diffusers-play/blob/3f16355dd0064932d0bf356ed78676089b9e46ca/scripts/convert_diffusers20_original_sdxl.py).
+
+I invoked it like so:
+
+```bash
+python scripts/convert_diffusers20_original_sdxl.py \
+  --fp16 \
+  --use_safetensors \
+  --reference_model stabilityai/stable-diffusion-xl-base-0.9 \
+  in/wdxl-aesthetic-0.9.safetensors \
+  out/wdxl-diffusers
+```
 
 ### NOTE: The work here is a Work in Progress! Nothing in this repository is final.
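
A quick way to check a conversion like the one in the invocation above is to load the resulting folder straight from disk. A minimal sketch, assuming the converter wrote a standard diffusers directory to `out/wdxl-diffusers` (the output path from the command above; not verified here):

```python
# Sketch: smoke-test the converter's output by loading it locally.
# Assumes out/wdxl-diffusers is a standard diffusers folder layout.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
  'out/wdxl-diffusers',  # local folder produced by convert_diffusers20_original_sdxl.py
  torch_dtype=torch.float16,
  use_safetensors=True,
)
pipe.to('cuda')

# a single short generation is enough to confirm the weights load and run
image = pipe('1girl, looking at viewer', num_inference_steps=25).images[0]
image.save('smoke-test.png')
```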