ostris committed on
Commit 4e2f143 · verified · 1 Parent(s): 0b3f637

Update README.md

  1. README.md +49 -0
README.md CHANGED
@@ -1,3 +1,52 @@
  ---
  license: creativeml-openrail-m
+ library_name: diffusers
+ pipeline_tag: text-to-image
  ---
+ # SD 1.5 Big G (alpha)
+
+ This is a Stable Diffusion 1.5 model, but it uses the [CLIP Big G](https://huggingface.co/laion/CLIP-ViT-bigG-14-laion2B-39B-b160k) text encoder instead of the original [CLIP-L](https://huggingface.co/openai/clip-vit-large-patch14) text encoder.
+ This is just a knowledge transfer pre-train with the goal of preserving the current knowledge of the model.
+ It was trained only with student/teacher training from my [SD 1.5 fine tune, Objective Reality v2](https://huggingface.co/ostris/objective-reality).
+ To realize the full potential of the much larger text encoder, it would need to be further fine-tuned on a large dataset.
+
+ # Examples
+
+ Coming soon
+
+ # Usage
+
+ For diffusers, you can use it like any other Stable Diffusion model.
+
+ ```python
+ from diffusers import StableDiffusionPipeline
+ import torch
+
+ model_id = "ostris/sd15-big-g-alpha"
+ pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
+ pipe = pipe.to("cuda")
+
+ prompt = "a photo of an astronaut riding a horse on mars"
+ image = pipe(prompt).images[0]
+
+ image.save("astronaut_rides_horse.png")
+ ```
+
+ It will not work out of the box with ComfyUI or Auto1111. Special code would be needed to load it. If there is any interest in this model, I may work on compatibility.
+ Overall, it won't be hard to add. The only architecture changes are the text encoder and the cross attention weights.
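
To make the architecture note concrete, here is a minimal sketch of which weights in a single cross attention layer depend on the width of the text encoder. The numbers are assumptions, not values read from this repository: 768 is the hidden size of CLIP-L as used by stock SD 1.5, 1280 is the hidden size of the CLIP Big G text encoder, and 320 is the channel width of the first SD 1.5 UNet block. `cross_attention_shapes` is a hypothetical helper written for illustration, not a diffusers API.

```python
# Hypothetical helper for illustration; not part of diffusers.
# Shapes follow the (out_features, in_features) layout of a linear layer weight.
def cross_attention_shapes(inner_dim, context_dim):
    return {
        "to_q": (inner_dim, inner_dim),    # queries come from image features
        "to_k": (inner_dim, context_dim),  # keys come from text embeddings
        "to_v": (inner_dim, context_dim),  # values come from text embeddings
        "to_out": (inner_dim, inner_dim),  # output projection back to image features
    }

CLIP_L_DIM = 768   # assumed text embedding width for stock SD 1.5
BIG_G_DIM = 1280   # assumed text embedding width for CLIP Big G
INNER_DIM = 320    # assumed width of the first UNet block

old = cross_attention_shapes(INNER_DIM, CLIP_L_DIM)
new = cross_attention_shapes(INNER_DIM, BIG_G_DIM)

# Only the projections that read the text embeddings change shape.
changed = [name for name in old if old[name] != new[name]]
print(changed)
```

Under these assumptions only the key and value projections change shape, which is why a loader for another UI would mainly need to handle the new text encoder and those cross attention weights.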
+
+ # Alpha
+
+ This is just a pretrained alpha. Some concepts did not seem to transfer. It really needs proper training on a large dataset. Anyone is welcome to take this task on. I do not plan to at this time.
+
+ # Why make this?
+
+ In the words of George Mallory, "Because it's there."
+
+ # Training Method
+
+ As mentioned above, it was trained using student/teacher training only. This was an iterative process over the course of a few months, and I did not keep track of all of the exact numbers. The following are best estimates.
+
+ The cross attention layers were trained for 1-2 million steps with a batch size of 8 on a single 4090 GPU. Then the full UNet was trained for around 100k steps with the same settings.
+
+
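
The exact loss and data pipeline are not stated above, so the following is only a toy sketch of student/teacher training in general, not the procedure used for this model. It uses scalar stand-ins: `teacher_encode` and `student_encode` are hypothetical placeholders for the two text encoders, `teacher_predict` plays the role of the frozen original model, and the student is a two-parameter linear model. The point it illustrates is that the student never sees ground-truth labels, only the teacher's output for the same input.

```python
import random

def teacher_encode(x):
    # Stand-in for the original text encoder (hypothetical toy mapping).
    return x

def student_encode(x):
    # Stand-in for the replacement text encoder: same input, different embedding.
    return 2.0 * x - 1.0

def teacher_predict(x):
    # Frozen teacher; in the real setting this would be the original model.
    return 3.0 * teacher_encode(x) + 1.0

def train_student(steps=2000, lr=0.05, seed=0):
    rng = random.Random(seed)
    w, b = 0.0, 0.0  # trainable student parameters
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)        # a training input (the "prompt")
        target = teacher_predict(x)       # teacher output is the only supervision
        feat = student_encode(x)          # student sees its own encoding
        err = (w * feat + b) - target
        # Gradient step on the squared error 0.5 * err**2
        w -= lr * err * feat
        b -= lr * err
    return w, b

w, b = train_student()
```

In this toy setup the student should end up close to `w = 1.5`, `b = 2.5`, the values that reproduce the teacher's output from the student's own encoding.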