Bagheera Bghira committed on
Commit
6a094b1
1 Parent(s): f365259

24150 steps: mj-60, text-1mp, anatomy, cinemamix, photo-aesthetics, shutterstock, sports, yoga, n1024

README.md CHANGED
@@ -1,3 +1,325 @@
1
  ---
2
- license: openrail
3
  ---
1
  ---
2
+ license: creativeml-openrail-m
3
+ tags:
4
+ - stable-diffusion
5
+ - stable-diffusion-2-1
6
+ - text-to-image
7
+ pinned: true
8
+ library_name: diffusers
9
  ---
10
+
11
+ # Model Card for pseudo-flex-base (1024x1024 base resolution)
12
+ ![img](assets/banner.png)
13
+ <!-- Provide a quick summary of what the model is/does. [Optional] -->
14
+ A fine-tune of stable-diffusion-2-1 (`stabilityai/stable-diffusion-2-1`) on multiple aspect ratios, turned into a photography model using `ptx0/pseudo-real-beta`.
15
+
16
+ ## Sample images
17
+
18
+ **Seed**: 2695929547
19
+
20
+ **Steps**: 25
21
+
22
+ **Sampler**: DDIM, default model config settings
23
+
24
+ **Version**: PyTorch 2.0.1, Diffusers 0.17.1
25
+
26
+ **Guidance**: 9.2
27
+
28
+ **Guidance rescale**: 0.0
29
+
30
+ | resolution | model | stable diffusion | pseudo-flex | realism-engine |
31
+ |:---------------:|:-------:|:------------------------------:|:-------------------------------:|:---------------------------------:|
32
+ | 753x1004 (3:4) | v2-1 | ![img](assets/fam-base.png) | ![img](assets/fam-flex.png) | ![img](assets/fam-realism.png) |
33
+ | 1280x720 (16:9) | v2-1 | ![img](assets/ellen-base.png) | ![img](assets/ellen-flex.png) | ![img](assets/ellen-realism.png) |
34
+ | 1024x1024 (1:1) | v2-1 | ![img](assets/woman-base.png) | ![img](assets/woman-flex.png) | ![img](assets/woman-realism.png) |
35
+ | 1024x1024 (1:1) | v2-1 | ![img](assets/dark-base.png) | ![img](assets/dark-flex.png) | ![img](assets/dark-realism.png) |
36
+
37
+
38
+ ## Background
39
+
40
+ The `ptx0/pseudo-real-beta` pretrained checkpoint had its unet trained for 4,200 steps and its text encoder trained for 15,600 steps at a batch size of 15 with 10 gradient accumulation steps, on a diverse dataset:
41
+
42
+ * cushman (8000 kodachrome slides from 1939 to 1969)
43
+ * midjourney v5.1-filtered (about 22,000 upscaled v5.1 images)
44
+ * national geographic (about 3,000-4,000 >1024x768 images of animals, wildlife, landscapes, history)
45
+ * a small dataset of stock images of people vaping / smoking
46
+
47
+ It is capable of a diverse range of photorealistic and adventure imagery, with strong prompt coherence. However, it lacks multi-aspect capability.
48
+
49
+ The code used to train `pseudo-real-beta` did not have aspect bucketing support. I discovered `pseudo-flex-base` by @ttj, which supported theories I had.
50
+
51
+ ## Training code
52
+
53
+ I added thorough aspect bucketing support to my training loop dataloader by having it throw away any image under 1024x1024 and resize all remaining images so that the smaller side is 1024. The aspect ratio of the image determines the new length of the other dimension, e.g. it is used as a multiplier for landscape or a divisor for portrait mode.
54
+
55
+ All batches contain images of the same resolution. Different resolutions at the same aspect ratio are all resized to 1024x... or ...x1024. A 1920x1080 image becomes approximately 1820x1024.
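
The resizing rule described above can be sketched as follows. This is a minimal illustration, not the actual dataloader code: the helper names `bucket_size` and `group_by_bucket` are hypothetical, and rounding to the nearest pixel is an assumption (the text only says a 1920x1080 image becomes approximately 1820x1024).

```python
from collections import defaultdict

MIN_SIDE = 1024  # target length of the shorter side, per the description above


def bucket_size(width, height, min_side=MIN_SIDE):
    """Return the (width, height) an image is resized to, or None if discarded.

    Hypothetical helper: images smaller than min_side on either side are
    dropped; otherwise the shorter side becomes min_side and the longer side
    is scaled by the aspect ratio (nearest-pixel rounding is an assumption).
    """
    if width < min_side or height < min_side:
        return None
    if width >= height:
        # Landscape or square: the aspect ratio acts as a multiplier.
        return round(min_side * width / height), min_side
    # Portrait: the height is the long side.
    return min_side, round(min_side * height / width)


def group_by_bucket(sizes):
    """Group image indices by resized resolution so each batch shares one shape."""
    buckets = defaultdict(list)
    for index, (width, height) in enumerate(sizes):
        target = bucket_size(width, height)
        if target is not None:
            buckets[target].append(index)
    return dict(buckets)
```

Drawing every batch from a single bucket is what keeps all images in a batch at the same resolution.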
56
+
57
+ ## Starting checkpoint
58
+
59
+ This model, `pseudo-flex-base`, was created by fine-tuning the base `stabilityai/stable-diffusion-2-1` 768 model, with its text encoder frozen, for 1000 steps on 148,000 images from LAION HD, using the TEXT field as their caption.
60
+
61
+ The effective batch size was again 150: a batch size of 15 with 10 gradient accumulation steps. This is very slow at very high resolutions; an aspect ratio of 1.5-1.7 causes each iteration to take about 700 seconds on an A100 80G.
62
+
63
+ This training took two days.
64
+
65
+ ## Text encoder swap
66
+
67
+ At 1000 steps, the text encoder from `ptx0/pseudo-real-beta` was used experimentally with this model's unet in an attempt to resolve some residual image noise, e.g. pixelation. That worked!
68
+
69
+ The training was restarted from ckpt 1000 with this text encoder.
70
+
71
+ ## The beginnings of wide / portrait aspect appearing
72
+
73
+ Validation prompts began to "pull together" from 1300 to 2950 steps. Some checkpoints show regression, but these usually resolve in about 100 steps. Improvements were always present, despite regressions.
74
+
75
+ ## Degradation and dataset swap
76
+
77
+ After training for some time on 148,000 images at a batch size of 150 over 3000 steps, images began to degrade. This is presumably due to having completed 3 repeats on all images in the set, and that is only if all images in the set had been used. Considering that some of the image filters discarded about 50,000 images, we landed at 9 repeats per image on our super low learning rate.
78
+
79
+ This caused three issues:
80
+
81
+ * The images were beginning to show static noise.
82
+ * The training was taking a very long time, and each checkpoint showed little improvement.
83
+ * Overfitting to prompt vocabulary, and a lack of generalization.
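
The repeat counts quoted in this section follow from simple arithmetic on steps, effective batch size, and dataset size. The sketch below is only a sanity check under the numbers given in the text; `repeats_per_image` is a hypothetical helper. With 148,000 images the average is about 3 repeats. If 50,000 images were discarded (leaving 98,000), the average would be about 4.6, whereas a figure of 9 corresponds to roughly 50,000 images actually being used. The text does not say which reading is intended.

```python
def repeats_per_image(steps, effective_batch_size, dataset_size):
    """Average number of times each image is seen during training."""
    return steps * effective_batch_size / dataset_size


# Batch of 15 with 10 gradient accumulation steps, as stated in the text.
effective_batch = 15 * 10
# Total samples consumed over 3000 optimizer steps.
samples_seen = 3000 * effective_batch

# Repeats if all 148,000 images were used.
full_set = repeats_per_image(3000, effective_batch, 148_000)
# Repeats if 50,000 images were discarded, leaving 98,000 (one reading of the text).
after_filter = repeats_per_image(3000, effective_batch, 98_000)
# Dataset size that would give exactly 9 repeats.
images_for_nine = samples_seen / 9

print(f"{full_set:.2f} {after_filter:.2f} {images_for_nine:.0f}")
```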
84
+
85
+ Ergo, at 1300 steps, the decision was made to cease training on the original LAION HD dataset, and instead, train on a *new* freshly-retrieved subset of high-resolution Midjourney v5.1 data.
86
+
87
+ This consisted of 17,800 images at a base resolution of 1024x1024, with about 700 samples in portrait and 700 samples in landscape.
88
+
89
+ ## Contrast issues
90
+
91
+ As the checkpoint 3275 was tested, a common observation was that darker images were washed out, and brighter images seemed "meh".
92
+
93
+ Various CFG rescale and guidance levels were tested, with the best dark images occurring around `guidance_scale=9.2` and `guidance_rescale=0.0` but they remained "washed out".
94
+
95
+ ## Dataset change number two
96
+
97
+ A new LAION subset was prepared with unique images and no square images - just a limited collection of aspect ratios:
98
+
99
+ * 16:9
100
+ * 9:16
101
+ * 2:3
102
+ * 3:2
103
+
104
+ This was intended to speed up the understanding of the model, and prevent overfitting on captions.
105
+
106
+ This LAION subset contained 17,800 images, evenly distributed through aspect ratios.
107
+
108
+ The images were then captioned using T5 Flan with BLIP2, to obtain highly accurate results.
109
+
110
+ ## Contrast fix: offset noise / SNR gamma to the rescue?
111
+
112
+ Offset noise and SNR gamma were applied experimentally to the checkpoint **4250**:
113
+
114
+ * `snr_gamma=5.0`
115
+ * `noise_offset=0.2`
116
+ * `noise_pertubation=0.1`
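
For readers unfamiliar with the first two settings, here is a minimal pure-Python sketch of what they usually mean. It assumes the commonly published formulations (Min-SNR loss weighting for noise prediction, and offset noise as one random constant added per channel). The source does not state which exact formulas the trainer used, and the function names are hypothetical. A real training loop would do this with tensor operations.

```python
import random


def min_snr_weight(snr, gamma=5.0):
    """Min-SNR loss weight for noise prediction: min(snr, gamma) / snr.

    Assumes snr > 0; the division is undefined when snr is exactly zero.
    """
    return min(snr, gamma) / snr


def offset_noise(channels, height, width, noise_offset=0.2, rng=random):
    """Gaussian noise plus one random constant per channel, scaled by noise_offset.

    Returns (noise, shifts), where shifts holds the per-channel constants.
    """
    shifts = [noise_offset * rng.gauss(0.0, 1.0) for _ in range(channels)]
    noise = [
        [[rng.gauss(0.0, 1.0) + shifts[c] for _ in range(width)]
         for _ in range(height)]
        for c in range(channels)
    ]
    return noise, shifts
```

The per-channel shift lets the model learn to change the overall brightness of an image, which is why offset noise is often tried when dark prompts come out washed out.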
117
+
118
+ Within 25 steps of training, the contrast was back, and the prompt `a solid black square` once again produced a reasonable result.
119
+
120
+ At 50 steps of offset noise, things really seemed to "click" and `a solid black square` had the fewest deformities I've seen.
121
+
122
+ The step 75 checkpoint was broken. The SNR gamma math resulted in numeric instability and was disabled. The offset noise parameters were left untouched.
123
+
124
+ ## Success! Improvement in quality and contrast.
125
+
126
+ Similar to the text encoder swap, the images showed a marked improvement over the next several checkpoints.
127
+
128
+ It was left to its own devices, and at step 4475, enough improvement was observed that another revision in this repository was created.
129
+
130
+
131
+ # Status: Test release
132
+
133
+ This model has been packaged up in a test form so that it can be thoroughly assessed by users.
134
+
135
+ For usage, see [How to Get Started with the Model](#how-to-get-started-with-the-model).
136
+
137
+ ### It aims to solve the following issues:
138
+
139
+ 1. Generated images look like they are cropped from a larger image.
140
+
141
+ 2. Generating non-square images creates weird results, due to the model being trained on square images.
142
+
143
+
144
+ ### Limitations:
145
+ 1. It's trained on a small dataset, so its improvements may be limited.
146
+ 2. The model architecture of SD 2.1 is older than SDXL, and will not generate comparably good results.
147
+
148
+ For the 1:1 aspect ratio, it is fine-tuned at 1024x1024, although `ptx0/pseudo-real-beta`, which it was based on, was last fine-tuned at 768x768.
149
+
150
+ ### Potential improvements:
151
+ 1. Train on a captioned dataset. This model used the TEXT field from LAION for convenience, though COCO-generated captions would be superior.
152
+ 2. Train the text encoder on large images.
153
+ 3. Enforce periodic caption dropout to help condition classifier-free guidance capabilities.
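
Caption dropout, the third item above, is usually implemented by replacing a caption with the empty string for a fraction of training samples, so the model also learns the unconditional case that classifier-free guidance relies on. The sketch below is an illustration only: the helper name `maybe_drop_caption` and the 10% default are assumptions, not values taken from this model's training.

```python
import random


def maybe_drop_caption(caption, dropout_probability=0.1, rng=random):
    """Return an empty caption with the given probability, else the original.

    The 0.1 default is an illustrative assumption, not a setting from the source.
    """
    if rng.random() < dropout_probability:
        return ""
    return caption


# Example: apply dropout to a small batch of captions with a seeded generator.
rng = random.Random(0)
captions = ["a woman on the beach", "a man playing guitar", "a leopard gecko"]
batch = [maybe_drop_caption(c, rng=rng) for c in captions]
```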
154
+
155
+
156
+ # Table of Contents
157
+
158
+ - [Model Card for pseudo-flex-base](#model-card-for-pseudo-flex-base-1024x1024-base-resolution)
159
+ - [Table of Contents](#table-of-contents)
160
+ - [Table of Contents](#table-of-contents-1)
161
+ - [Model Details](#model-details)
162
+ - [Model Description](#model-description)
163
+ - [Uses](#uses)
164
+ - [Direct Use](#direct-use)
165
+ - [Downstream Use [Optional]](#downstream-use-optional)
166
+ - [Out-of-Scope Use](#out-of-scope-use)
167
+ - [Bias, Risks, and Limitations](#bias-risks-and-limitations)
168
+ - [Recommendations](#recommendations)
169
+ - [Training Details](#training-details)
170
+ - [Training Data](#training-data)
171
+ - [Training Procedure](#training-procedure)
172
+ - [Preprocessing](#preprocessing)
173
+ - [Speeds, Sizes, Times](#speeds-sizes-times)
174
+ - [Evaluation](#evaluation)
175
+ - [Testing Data, Factors & Metrics](#testing-data-factors--metrics)
176
+ - [Testing Data](#testing-data)
177
+ - [Factors](#factors)
178
+ - [Metrics](#metrics)
179
+ - [Results](#results)
180
+ - [Model Examination](#model-examination)
181
+ - [Environmental Impact](#environmental-impact)
182
+ - [Technical Specifications [optional]](#technical-specifications-optional)
183
+ - [Model Architecture and Objective](#model-architecture-and-objective)
184
+ - [Compute Infrastructure](#compute-infrastructure)
185
+ - [Hardware](#hardware)
186
+ - [Software](#software)
187
+ - [Citation](#citation)
188
+ - [Glossary [optional]](#glossary-optional)
189
+ - [More Information [optional]](#more-information-optional)
190
+ - [Model Card Authors [optional]](#model-card-authors-optional)
191
+ - [Model Card Contact](#model-card-contact)
192
+ - [How to Get Started with the Model](#how-to-get-started-with-the-model)
193
+
194
+
195
+ # Model Details
196
+
197
+ ## Model Description
198
+
199
+ <!-- Provide a longer summary of what this model is/does. -->
200
+ stable-diffusion-2-1 (stabilityai/stable-diffusion-2-1 and ptx0/pseudo-real-beta) finetuned for dynamic aspect ratios.
201
+
202
+ finetuned resolutions:
203
+ | | width | height | aspect ratio | images |
204
+ |---:|--------:|---------:|:--------------|-------:|
205
+ | 0 | 1024 | 1024 | 1:1 | 90561 |
206
+ | 1 | 1536 | 1024 | 3:2 | 8716 |
207
+ | 2 | 1365 | 1024 | 4:3 | 6933 |
208
+ | 3 | 1468 | 1024 | ~3:2 | 113 |
209
+ | 4 | 1778 | 1024 | ~5:3 | 6315 |
210
+ | 5 | 1200 | 1024 | ~5:4 | 6376 |
211
+ | 6 | 1333 | 1024 | ~4:3 | 2814 |
212
+ | 7 | 1281 | 1024 | ~5:4 | 52 |
213
+ | 8 | 1504 | 1024 | ~3:2 | 139 |
214
+ | 9 | 1479 | 1024 | ~3:2 | 25 |
215
+ | 10 | 1384 | 1024 | ~4:3 | 1676 |
216
+ | 11 | 1370 | 1024 | ~4:3 | 63 |
217
+ | 12 | 1499 | 1024 | ~3:2 | 436 |
218
+ | 13 | 1376 | 1024 | ~4:3 | 68 |
219
+
220
+ Other aspects were in smaller buckets. It could have been done more succinctly or carefully, but careless handling of the data was a part of the experiment parameters.
221
+
222
+ - **Developed by:** pseudoterminal
223
+ - **Model type:** Diffusion-based text-to-image generation model
224
+ - **Language(s)**: English
225
+ - **License:** creativeml-openrail-m
226
+ - **Parent Model:** https://huggingface.co/ptx0/pseudo-real-beta
227
+ - **Resources for more information:** More information needed
228
+
229
+ # Uses
230
+
231
+ - see https://huggingface.co/stabilityai/stable-diffusion-2-1
232
+
233
+
234
+ # Training Details
235
+
236
+ ## Training Data
237
+
238
+ - LAION HD dataset subsets
239
+ - https://huggingface.co/datasets/laion/laion-high-resolution
240
+ We only used a small portion of that, see [Preprocessing](#preprocessing)
241
+
242
+ ### Preprocessing
243
+
244
+ All pre-processing is done via the scripts in `bghira/SimpleTuner` on GitHub.
245
+
246
+ ### Speeds, Sizes, Times
247
+
248
+ - Dataset size: 100k image-caption pairs, after filtering.
249
+
250
+ - Hardware: 1 A100 80G GPU
251
+
252
+ - Optimizer: 8bit Adam
253
+
254
+ - Batch size: 150
255
+   - actual batch size: 15
256
+   - gradient_accumulation_steps: 10
257
+   - effective batch size: 150
258
+
259
+ - Learning rate: Constant 4e-8 which was adjusted by reducing batch size over time.
260
+
261
+ - Training steps: WIP (ongoing)
262
+
263
+ - Training time: approximately 4 days (so far)
264
+
265
+ ## Results
266
+
267
+ More information needed
268
+
269
+ # Model Card Authors
270
+
271
+ pseudoterminal
272
+
273
+
274
+ # How to Get Started with the Model
275
+
276
+ Use the code below to get started with the model.
277
+
278
+
279
+ ```python
280
+ # Use Pytorch 2!
281
+ import torch
282
+ from diffusers import StableDiffusionPipeline, DiffusionPipeline, AutoencoderKL, UNet2DConditionModel, DDPMScheduler
283
+ from transformers import CLIPTextModel
284
+
285
+ # Any model currently on Huggingface Hub.
286
+ model_id = 'ptx0/pseudo-flex-base'
287
+ pipeline = DiffusionPipeline.from_pretrained(model_id)
288
+
289
+ # Optimize!
290
+ pipeline.unet = torch.compile(pipeline.unet)
291
+ scheduler = DDPMScheduler.from_pretrained(
292
+     model_id,
293
+     subfolder="scheduler"
294
+ )  # Note: this scheduler is not attached to the pipeline; assign it to pipeline.scheduler to use it.
295
+
296
+ # Remove this if you get an error.
297
+ torch.set_float32_matmul_precision('high')
298
+
299
+ pipeline.to('cuda')
300
+ prompts = {
301
+     "woman": "a woman, hanging out on the beach",
302
+     "man": "a man playing guitar in a park",
303
+     "lion": "Explore the ++majestic beauty++ of untamed ++lion prides++ as they roam the African plains --captivating expressions-- in the wildest national geographic adventure",
304
+     "child": "a child flying a kite on a sunny day",
305
+     "bear": "best quality ((bear)) in the swiss alps cinematic 8k highly detailed sharp focus intricate fur",
306
+     "alien": "an alien exploring the Mars surface",
307
+     "robot": "a robot serving coffee in a cafe",
308
+     "knight": "a knight protecting a castle",
309
+     "menn": "a group of smiling and happy men",
310
+     "bicycle": "a bicycle, on a mountainside, on a sunny day",
311
+     "cosmic": "cosmic entity, sitting in an impossible position, quantum reality, colours",
312
+     "wizard": "a mage wizard, bearded and gray hair, blue star hat with wand and mystical haze",
313
+     "wizarddd": "digital art, fantasy, portrait of an old wizard, detailed",
314
+     "macro": "a dramatic city-scape at sunset or sunrise",
315
+     "micro": "RNA and other molecular machinery of life",
316
+     "gecko": "a leopard gecko stalking a cricket"
317
+ }
318
+ for shortname, prompt in prompts.items():
319
+     # old prompt: ''
320
+     image = pipeline(prompt=prompt,
321
+         negative_prompt='malformed, disgusting, overexposed, washed-out',
322
+         generator=torch.Generator(device='cuda').manual_seed(1641421826),
323
+         width=1368, height=720, guidance_scale=7.5, guidance_rescale=0.3, num_inference_steps=25).images[0]
324
+     image.save(f'test/{shortname}_nobetas.png', format="PNG")  # the test/ directory must already exist
325
+ ```
assets/.keep ADDED
File without changes
assets/banner.png ADDED

Git LFS Details

  • SHA256: 7da5ede9eab30bf5490fd2961166bc7662127e09b68b6c23039d74e885a7eed1
  • Pointer size: 132 Bytes
  • Size of remote file: 1.11 MB
assets/dark-base.png ADDED

Git LFS Details

  • SHA256: bf47584aa3bfe74f709b9ee2f709ea0119bf6a7c4ca826f68c5fe6483fb27445
  • Pointer size: 132 Bytes
  • Size of remote file: 1.66 MB
assets/dark-flex.png ADDED

Git LFS Details

  • SHA256: 761193fdb256ff8d2848b2983720006a240f6d392085287088b69e6affed4fdd
  • Pointer size: 131 Bytes
  • Size of remote file: 601 kB
assets/dark-realism.png ADDED

Git LFS Details

  • SHA256: fefb8f4ee2eea2041c19b449997d14633f77c1e87bd43e0afa1e2815beb8f2c4
  • Pointer size: 132 Bytes
  • Size of remote file: 1.25 MB
assets/ellen-base.png ADDED

Git LFS Details

  • SHA256: 113b029ffd65f34f9da140507b91e0eabc9595160ee352975d36c70625412314
  • Pointer size: 132 Bytes
  • Size of remote file: 1.47 MB
assets/ellen-flex.png ADDED

Git LFS Details

  • SHA256: 2b3b4ab9d1341e618113a7f744371023f4c099b9edd62424a0e883d25dba1b3b
  • Pointer size: 132 Bytes
  • Size of remote file: 1.15 MB
assets/ellen-realism.png ADDED

Git LFS Details

  • SHA256: abb71d4f3338c0c78ba5132132b90e8b6c5268291d2ab487fe4175ae9dc14753
  • Pointer size: 132 Bytes
  • Size of remote file: 1.39 MB
assets/fam-base.png ADDED

Git LFS Details

  • SHA256: 62f3a8a9336d1c896cdb0a866280cedce2091e0b37e9c9cf321537f0eb8eac0c
  • Pointer size: 132 Bytes
  • Size of remote file: 2.12 MB
assets/fam-flex.png ADDED

Git LFS Details

  • SHA256: bdc6cbc43a64cada596b5f629e7ea5d0eadc7aa9e30b82bc86527e467e658369
  • Pointer size: 132 Bytes
  • Size of remote file: 1.01 MB
assets/fam-realism.png ADDED

Git LFS Details

  • SHA256: 3fe33b99a22cf16db508c25bcee11d0f866e9e1b5d7e5be42600a107eb5b893a
  • Pointer size: 132 Bytes
  • Size of remote file: 1.13 MB
assets/woman-base.png ADDED

Git LFS Details

  • SHA256: 706c673fb1c361da7adc10e03bdd2714003ccef42b120caee1a34726fcd73b16
  • Pointer size: 132 Bytes
  • Size of remote file: 1.34 MB
assets/woman-flex.png ADDED

Git LFS Details

  • SHA256: 5e9ed6e2935de3605c99e79df3fd12b96a6143be0e03a242eed72f2c1f6721f2
  • Pointer size: 131 Bytes
  • Size of remote file: 529 kB
assets/woman-realism.png ADDED

Git LFS Details

  • SHA256: 3b27ae3ecaa955e3eecc968fc736c95c36b031dffbdad9edaa7009c6cce6bdeb
  • Pointer size: 132 Bytes
  • Size of remote file: 1.2 MB
ema_unet/config.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4468b6b73594408a7f51643a8f67c29d29438b3ee625f2bbd34d76332f4e97ca
3
+ size 2029
ema_unet/diffusion_pytorch_model.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:fc09380d13cd243be98c642b70afe8dd0e888943ab7bc54a076edfdedaa05483
3
+ size 3463726504
feature_extractor/preprocessor_config.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4db495644e3e5bd8fcac52f70e7fc0b413c911086021acf73ac30e5911166e95
3
+ size 518
model_index.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:45b32dee83ba23fcf7d836f5970f8f982ede609892bd4002c89ddf61bc2469a5
3
+ size 543
optimizer.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d0e2d2ea0c77abd46362e23dcea4f54fc550ff68efe96a84314f9b5f918a6a4a
3
+ size 6927867604
random_states_0.pkl ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:352a0b48037d2149964afd7f9885e50ac94d559f8dcce75f1f47aca9d31a243a
3
+ size 14344
scheduler.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:da71f151a50994e7de5cbae45eafb922ef01bd084c5a3c45f65f9dc023e4ce1d
3
+ size 1000
scheduler/scheduler_config.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f86909bc657068f979a38b6533d35dee417a4b08f2ae1b085bb991ed7685cb18
3
+ size 346
text_encoder/config.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ab32fb24abe9e07afbb4d30cb6143c2ebde483dd209d59740d164a5d28f6c376
3
+ size 701
text_encoder/model.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:012bcb2fc34c59837cf9f13d28da60f45ddebb1ee0f0455e08d04a4735016045
3
+ size 1361597016
tokenizer/merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer/special_tokens_map.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f118ab3a983206e4f32583448de6bd6aae4ee21869135cef1f5848a753cdaab6
3
+ size 460
tokenizer/tokenizer_config.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:19d7b034cb0cc3ce9766c2231373ab8aa8991fc72e2c8f76558bfaae3de0d563
3
+ size 737
tokenizer/vocab.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e089ad92ba36837a0d31433e555c8f45fe601ab5c221d4f607ded32d9f7a4349
3
+ size 1059962
training_state-anatomy.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e148b0c288a31437a99f227900ac4ad7a7eced86c3a440ad3748e49d7bf971a0
3
+ size 2082031
training_state-cinemamix-1mp.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:57757f3c983bd49a59369a585140f35d24388a107559b4b7a9de99194f0e20ae
3
+ size 1055585
training_state-mj-60.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4fa7baf925f95f9519321101dd763bff0906548754bb1cfa52b719317b996914
3
+ size 45936912
training_state-nsfw-1024.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d6976e3789a9eee0127e140e5970b0ac5be9628d04333ce715fe1fc1672054a2
3
+ size 782526
training_state-photo-aesthetics.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f998606ecaec6243a0b982d3dbba63829874d44f564a32658bcd86711c3fbe90
3
+ size 6948163
training_state-shutterstock.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ba6652798531e37f909500ca9643eab46b8608d82801eebc62046d158e16b772
3
+ size 3263336
training_state-sports.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:010b9de55c47e7ce80edc785f0e331e9d72b8a0e47e7dc09d4c0a9e2e72b2841
3
+ size 117672
training_state-text-1mp.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7dc37cdcb2512c8d486e72188196e77a722ea49f283ea6795b7ea3819b0a3e2c
3
+ size 1536598
training_state-yoga.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:98e2e545c6e8fc215e04d8815a0a06e8eeabe3e03c3d172ccd26b3b8e5d64968
3
+ size 538833
training_state.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5e394183cbd5c365fb4e0c51db3c8ff939f379c7df3d26eb4dc8a80f59212458
3
+ size 310
unet/config.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3197c08397f77e403b54b8bc16707b7e06d0a9c1d88e13419c5abeaa982e22b8
3
+ size 1856
unet/diffusion_pytorch_model.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7be172ec2bcbf3d279034b6520b85f89a6108944526b2fde382341c57b035cdd
3
+ size 3463726504
vae/config.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:95d23ed9665de3ea1094f518e43ab731728721dc8ae2773d5f9b56118cca8e1d
3
+ size 719
vae/diffusion_pytorch_model.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2aa1f43011b553a4cba7f37456465cdbd48aab7b54b9348b890e8058ea7683ec
3
+ size 334643268