ptx0 committed on
Commit
d81a740
1 Parent(s): 1d22f71

Trained for 0 epochs and 1500 steps.


Trained with datasets ['text-embeds-sd2x', 'celebrities', 'movieposters', 'normalnudes', 'propagandaposters', 'guys', 'pixel-art', 'signs', 'moviecollection', 'bookcovers', 'nijijourney', 'experimental', 'ethnic', 'sports', 'gay', 'architecture', 'shutterstock', 'cinemamix-1mp', 'nsfw-1024', 'anatomy', 'bg20k-1024', 'yoga']
Learning rate 4e-07, batch size 4, and 1 gradient accumulation steps.
Used the DDPM noise scheduler for training, with the v_prediction prediction type and rescaled_betas_zero_snr=True
Using 'trailing' timestep spacing.
Base model: stabilityai/stable-diffusion-2-1
VAE: None
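The 'trailing' timestep spacing noted above can be sketched as follows. This is a simplified illustration (assuming the train timestep count divides evenly by the inference step count), not the exact diffusers implementation:

```python
def trailing_timesteps(num_train_timesteps=1000, num_inference_steps=25):
    """Pick inference timesteps counting back from the final train timestep
    ('trailing' spacing), so sampling starts at the very last timestep
    instead of skipping the tail of the schedule."""
    step = num_train_timesteps // num_inference_steps
    return [t - 1 for t in range(num_train_timesteps, 0, -step)]

timesteps = trailing_timesteps()
# 25 timesteps: 999, 959, ..., 39
```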

README.md CHANGED
@@ -1,325 +1,229 @@
1
  ---
2
  license: creativeml-openrail-m
 
3
  tags:
4
- - stable-diffusion
5
- - stable-diffusion-2-1
6
- - text-to-image
7
- pinned: true
8
- library_name: diffusers
9
- ---
10
-
11
- # Model Card for pseudo-flex-base (1024x1024 base resolution)
12
- ![img](assets/banner.png)
13
- <!-- Provide a quick summary of what the model is/does. [Optional] -->
14
- stable-diffusion-2-1 (stabilityai/stable-diffusion-2-1) finetuned with different aspect ratios, based on a photography model (ptx0/pseudo-real-beta).
15
-
16
- ## Sample images
17
-
18
- **Seed**: 2695929547
19
-
20
- **Steps**: 25
21
-
22
- **Sampler**: DDIM, default model config settings
23
-
24
- **Version**: Pytorch 2.0.1, Diffusers 0.17.1
25
-
26
- **Guidance**: 9.2
27
-
28
- **Guidance rescale**: 0.0
29
-
30
- | resolution | model | stable diffusion | pseudo-flex | realism-engine |
31
- |:---------------:|:-------:|:------------------------------:|:-------------------------------:|:---------------------------------:|
32
- | 753x1004 (4:3) | v2-1 | ![img](assets/fam-base.png) | ![img](assets/fam-flex.png) | ![img](assets/fam-realism.png) |
33
- | 1280x720 (16:9) | v2-1 | ![img](assets/ellen-base.png) | ![img](assets/ellen-flex.png) | ![img](assets/ellen-realism.png) |
34
- | 1024x1024 (1:1) | v2-1 | ![img](assets/woman-base.png) | ![img](assets/woman-flex.png) | ![img](assets/woman-realism.png) |
35
- | 1024x1024 (1:1) | v2-1 | ![img](assets/dark-base.png) | ![img](assets/dark-flex.png) | ![img](assets/dark-realism.png) |
36
-
37
-
38
- ## Background
39
-
40
- The `ptx0/pseudo-real-beta` pretrained checkpoint had its unet trained for 4,200 steps and its text encoder trained for 15,600 steps at a batch size of 15 with 10 gradient accumulations, on a diverse dataset:
41
-
42
- * cushman (8000 kodachrome slides from 1939 to 1969)
43
- * midjourney v5.1-filtered (about 22,000 upscaled v5.1 images)
44
- * national geographic (about 3-4,000 >1024x768 images of animals, wildlife, landscapes, history)
45
- * a small dataset of stock images of people vaping / smoking
46
-
47
- It has diverse photorealistic and adventure capabilities with strong prompt coherence. However, it lacks multi-aspect capability.
48
-
49
- The code used to train `pseudo-real-beta` did not have aspect bucketing support. I discovered `pseudo-flex-base` by @ttj, which supported theories I had.
50
-
51
- ## Training code
52
-
53
- I added thorough aspect bucketing support to my training dataloader: it throws away any image under 1024x1024 and conditions all images so that the smaller side is 1024. The aspect ratio of the image determines the new length of the other dimension, e.g. used as a multiplier for landscape or a divisor for portrait images.
54
-
55
- All batches contain images of the same resolution. Different resolutions at the same aspect ratio are all conditioned to 1024x... or ...x1024. A 1920x1080 image becomes approximately 1820x1024.
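The conditioning step described above can be sketched as a small helper. The exact rounding used in training is an assumption here:

```python
def bucket_dimensions(width, height):
    """Discard images under 1024 on the short side; otherwise scale so the
    shorter side is exactly 1024 and the longer side follows the aspect ratio."""
    if min(width, height) < 1024:
        return None  # image is thrown away
    if width >= height:  # landscape or square: height becomes 1024
        return (round(width * 1024 / height), 1024)
    return (1024, round(height * 1024 / width))  # portrait: width becomes 1024

bucket_dimensions(1920, 1080)  # → (1820, 1024)
```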
56
-
57
- ## Starting checkpoint
58
-
59
- This model, `pseudo-flex-base`, was created by fine-tuning the base `stabilityai/stable-diffusion-2-1` 768 model, with its text encoder frozen, for 1000 steps on 148,000 images from LAION HD, using the TEXT field as the caption.
60
-
61
- The effective batch size was again 150: a batch size of 15 with 10 gradient accumulations. This is very slow at very high resolutions; an aspect ratio of 1.5-1.7 takes about 700 seconds per iteration on an A100 80G.
62
-
63
- This training took two days.
64
-
65
- ## Text encoder swap
66
-
67
- At 1000 steps, the text encoder from `ptx0/pseudo-real-beta` was used experimentally with this model's unet in an attempt to resolve some residual image noise, eg. pixelation. That worked!
68
-
69
- The training was restarted from ckpt 1000 with this text encoder.
70
-
71
- ## The beginnings of wide / portrait aspect appearing
72
-
73
- Validation prompts began to "pull together" from 1300 to 2950 steps. Some checkpoints show regression, but these usually resolve in about 100 steps. Improvements were always present, despite regressions.
74
-
75
- ## Degradation and dataset swap
76
-
77
- After training for some time on 148,000 images at a batch size of 150 over 3000 steps, images began to degrade. This is presumably due to having completed 3 repeats on all images in the set, and that is only if every image in the set had actually been used. Considering that some of the image filters discarded about 50,000 images, we landed at roughly 9 repeats per image, even at our very low learning rate.
78
-
79
- This caused three issues:
80
-
81
- * The images were beginning to show static noise.
82
- * The training was taking a very long time, and each checkpoint showed little improvement.
83
- * Overfitting to prompt vocabulary, and a lack of generalization.
84
-
85
- Ergo, at 1300 steps, the decision was made to cease training on the original LAION HD dataset, and instead, train on a *new* freshly-retrieved subset of high-resolution Midjourney v5.1 data.
86
-
87
- This consisted of 17,800 images at a base resolution of 1024x1024, with about 700 samples in portrait and 700 samples in landscape.
88
-
89
- ## Contrast issues
90
-
91
- As the checkpoint 3275 was tested, a common observation was that darker images were washed out, and brighter images seemed "meh".
92
-
93
- Various CFG rescale and guidance levels were tested, with the best dark images occurring around `guidance_scale=9.2` and `guidance_rescale=0.0`, but they remained "washed out".
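For reference, CFG rescale works by matching the standard deviation of the guided noise prediction back to that of the text-conditioned prediction, then blending. A minimal sketch over flat lists (following the approach behind the diffusers `guidance_rescale` option, not the library's exact tensor code):

```python
from statistics import pstdev

def rescale_noise_cfg(noise_cfg, noise_pred_text, guidance_rescale=0.0):
    """Blend the CFG output with a std-rescaled copy of itself.
    guidance_rescale=0.0 returns noise_cfg unchanged."""
    factor = pstdev(noise_pred_text) / pstdev(noise_cfg)
    return [
        guidance_rescale * (x * factor) + (1.0 - guidance_rescale) * x
        for x in noise_cfg
    ]
```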
94
-
95
- ## Dataset change number two
96
-
97
- A new LAION subset was prepared with unique images and no square images - just a limited collection of aspect ratios:
98
-
99
- * 16:9
100
- * 9:16
101
- * 2:3
102
- * 3:2
103
-
104
- This was intended to speed up the model's learning and prevent overfitting on captions.
105
-
106
- This LAION subset contained 17,800 images, evenly distributed through aspect ratios.
107
-
108
- The images were then captioned using BLIP2 (Flan-T5) to obtain highly accurate results.
109
-
110
- ## Contrast fix: offset noise / SNR gamma to the rescue?
111
-
112
- Offset noise and SNR gamma were applied experimentally to the checkpoint **4250**:
113
-
114
- * `snr_gamma=5.0`
115
- * `noise_offset=0.2`
116
- * `noise_pertubation=0.1`
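Offset noise itself is a small change to the noising step: a shared constant is added to the Gaussian noise so the model learns to shift overall brightness. A minimal sketch (the single shared draw per channel is an assumption for illustration, not the exact training code):

```python
import random

def apply_offset_noise(noise, noise_offset=0.2):
    """Add one shared random constant (scaled by noise_offset) to every
    element of a channel's noise, biasing that channel's mean."""
    shift = noise_offset * random.gauss(0.0, 1.0)  # one draw per channel
    return [x + shift for x in noise]
```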
117
-
118
- Within 25 steps of training, the contrast was back, and the prompt `a solid black square` once again produced a reasonable result.
119
-
120
- At 50 steps of offset noise, things really seemed to "click" and `a solid black square` had the fewest deformities I've seen.
121
-
122
- The step 75 checkpoint was broken. The SNR gamma math results in numerical instability, so it was disabled. The offset noise parameters were left untouched.
123
-
124
- ## Success! Improvement in quality and contrast.
125
 
126
- Similar to the text encoder swap, the images showed a marked improvement over the next several checkpoints.
127
 
128
- It was left to its own devices, and at step 4475, enough improvement was observed that another revision in this repository was created.
129
-
130
-
131
- # Status: Test release
132
-
133
- This model has been packaged up in a test form so that it can be thoroughly assessed by users.
134
-
135
- For usage, see - [How to Get Started with the Model](#how-to-get-started-with-the-model)
136
-
137
- ### It aims to solve the following issues:
138
-
139
- 1. Generated images look like they are cropped from a larger image.
140
-
141
- 2. Generating non-square images creates weird results, due to the model being trained on square images.
142
-
143
-
144
- ### Limitations:
145
- 1. It's trained on a small dataset, so its improvements may be limited.
146
- 2. The model architecture of SD 2.1 is older than SDXL, and will not generate comparably good results.
147
-
148
- For a 1:1 aspect ratio, it's fine-tuned at 1024x1024, although `ptx0/pseudo-real-beta`, which it was based on, was last finetuned at 768x768.
149
-
150
- ### Potential improvements:
151
- 1. Train on a captioned dataset. This model used the TEXT field from LAION for convenience, though COCO-generated captions would be superior.
152
- 2. Train the text encoder on large images.
153
- 3. Enforce periodic caption drop-out to improve classifier-free guidance capabilities.
154
-
155
-
156
- # Table of Contents
157
-
158
- - [Model Card for pseudo-flex-base](#model-card-for-pseudo-flex-base-1024x1024-base-resolution)
159
- - [Table of Contents](#table-of-contents)
160
161
- - [Model Details](#model-details)
162
- - [Model Description](#model-description)
163
- - [Uses](#uses)
164
- - [Direct Use](#direct-use)
165
- - [Downstream Use [Optional]](#downstream-use-optional)
166
- - [Out-of-Scope Use](#out-of-scope-use)
167
- - [Bias, Risks, and Limitations](#bias-risks-and-limitations)
168
- - [Recommendations](#recommendations)
169
- - [Training Details](#training-details)
170
- - [Training Data](#training-data)
171
- - [Training Procedure](#training-procedure)
172
- - [Preprocessing](#preprocessing)
173
- - [Speeds, Sizes, Times](#speeds-sizes-times)
174
- - [Evaluation](#evaluation)
175
- - [Testing Data, Factors & Metrics](#testing-data-factors--metrics)
176
- - [Testing Data](#testing-data)
177
- - [Factors](#factors)
178
- - [Metrics](#metrics)
179
- - [Results](#results)
180
- - [Model Examination](#model-examination)
181
- - [Environmental Impact](#environmental-impact)
182
- - [Technical Specifications [optional]](#technical-specifications-optional)
183
- - [Model Architecture and Objective](#model-architecture-and-objective)
184
- - [Compute Infrastructure](#compute-infrastructure)
185
- - [Hardware](#hardware)
186
- - [Software](#software)
187
- - [Citation](#citation)
188
- - [Glossary [optional]](#glossary-optional)
189
- - [More Information [optional]](#more-information-optional)
190
- - [Model Card Authors [optional]](#model-card-authors-optional)
191
- - [Model Card Contact](#model-card-contact)
192
- - [How to Get Started with the Model](#how-to-get-started-with-the-model)
193
-
194
-
195
- # Model Details
196
-
197
- ## Model Description
198
-
199
- <!-- Provide a longer summary of what this model is/does. -->
200
- stable-diffusion-2-1 (stabilityai/stable-diffusion-2-1 and ptx0/pseudo-real-beta) finetuned for dynamic aspect ratios.
201
-
202
- finetuned resolutions:
203
- | | width | height | aspect ratio | images |
204
- |---:|--------:|---------:|:--------------|-------:|
205
- | 0 | 1024 | 1024 | 1:1 | 90561 |
206
- | 1 | 1536 | 1024 | 3:2 | 8716 |
207
- | 2 | 1365 | 1024 | 4:3 | 6933 |
208
- | 3 | 1468 | 1024 | ~3:2 | 113 |
209
- | 4 | 1778 | 1024 | ~5:3 | 6315 |
210
- | 5 | 1200 | 1024 | ~5:4 | 6376 |
211
- | 6 | 1333 | 1024 | ~4:3 | 2814 |
212
- | 7 | 1281 | 1024 | ~5:4 | 52 |
213
- | 8 | 1504 | 1024 | ~3:2 | 139 |
214
- | 9 | 1479 | 1024 | ~3:2 | 25 |
215
- | 10 | 1384 | 1024 | ~4:3 | 1676 |
216
- | 11 | 1370 | 1024 | ~4:3 | 63 |
217
- | 12 | 1499 | 1024 | ~3:2 | 436 |
218
- | 13 | 1376 | 1024 | ~4:3 | 68 |
219
-
220
- Other aspects were in smaller buckets. It could have been done more succinctly or carefully, but careless handling of the data was a part of the experiment parameters.
221
-
222
- - **Developed by:** pseudoterminal
223
- - **Model type:** Diffusion-based text-to-image generation model
224
- - **Language(s)**: English
225
- - **License:** creativeml-openrail-m
226
- - **Parent Model:** https://huggingface.co/ptx0/pseudo-real-beta
227
- - **Resources for more information:** More information needed
228
-
229
- # Uses
230
-
231
- - see https://huggingface.co/stabilityai/stable-diffusion-2-1
232
-
233
-
234
- # Training Details
235
-
236
- ## Training Data
237
-
238
- - LAION HD dataset subsets
239
- - https://huggingface.co/datasets/laion/laion-high-resolution
240
- We used only a small portion of it; see [Preprocessing](#preprocessing)
241
-
242
- ### Preprocessing
243
-
244
- All pre-processing is done via the scripts in `bghira/SimpleTuner` on GitHub.
245
-
246
- ### Speeds, Sizes, Times
247
-
248
- - Dataset size: 100k image-caption pairs, after filtering.
249
-
250
- - Hardware: 1x A100 80G GPU
251
-
252
- - Optimizer: 8bit Adam
253
-
254
- - Batch size: 150
255
- - actual batch size: 15
256
- - gradient_accumulation_steps: 10
257
- - effective batch size: 150
258
-
259
- - Learning rate: constant 4e-8, effectively adjusted over time by reducing the batch size.
260
-
261
- - Training steps: WIP (ongoing)
262
-
263
- - Training time: approximately 4 days (so far)
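The batch-size bookkeeping above follows the usual arithmetic (micro-batch times gradient accumulation times GPU count), e.g.:

```python
def effective_batch_size(micro_batch, grad_accum_steps, num_gpus=1):
    # samples contributing to each optimizer step
    return micro_batch * grad_accum_steps * num_gpus

effective_batch_size(15, 10)  # → 150
```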
264
-
265
- ## Results
266
-
267
- More information needed
268
-
269
- # Model Card Authors
270
-
271
- pseudoterminal
272
-
273
-
274
- # How to Get Started with the Model
275
-
276
- Use the code below to get started with the model.
277
-
278
-
279
- ```python
280
- # Use Pytorch 2!
281
- import torch
282
- from diffusers import StableDiffusionPipeline, DiffusionPipeline, AutoencoderKL, UNet2DConditionModel, DDPMScheduler
283
- from transformers import CLIPTextModel
284
-
285
- # Any model currently on the Hugging Face Hub.
286
- model_id = 'ptx0/pseudo-flex-base'
287
- pipeline = DiffusionPipeline.from_pretrained(model_id)
288
-
289
- # Optimize!
290
- pipeline.unet = torch.compile(pipeline.unet)
291
- scheduler = DDPMScheduler.from_pretrained(
292
- model_id,
293
- subfolder="scheduler"
294
- )
- pipeline.scheduler = scheduler
295
 
296
- # Remove this if you get an error.
297
- torch.set_float32_matmul_precision('high')
298
 
299
- pipeline.to('cuda')
300
- prompts = {
301
- "woman": "a woman, hanging out on the beach",
302
- "man": "a man playing guitar in a park",
303
- "lion": "Explore the ++majestic beauty++ of untamed ++lion prides++ as they roam the African plains --captivating expressions-- in the wildest national geographic adventure",
304
- "child": "a child flying a kite on a sunny day",
305
- "bear": "best quality ((bear)) in the swiss alps cinematic 8k highly detailed sharp focus intricate fur",
306
- "alien": "an alien exploring the Mars surface",
307
- "robot": "a robot serving coffee in a cafe",
308
- "knight": "a knight protecting a castle",
309
- "menn": "a group of smiling and happy men",
310
- "bicycle": "a bicycle, on a mountainside, on a sunny day",
311
- "cosmic": "cosmic entity, sitting in an impossible position, quantum reality, colours",
312
- "wizard": "a mage wizard, bearded and gray hair, blue star hat with wand and mystical haze",
313
- "wizarddd": "digital art, fantasy, portrait of an old wizard, detailed",
314
- "macro": "a dramatic city-scape at sunset or sunrise",
315
- "micro": "RNA and other molecular machinery of life",
316
- "gecko": "a leopard gecko stalking a cricket"
317
- }
318
- for shortname, prompt in prompts.items():
319
- # old prompt: ''
320
- image = pipeline(prompt=prompt,
321
- negative_prompt='malformed, disgusting, overexposed, washed-out',
322
- num_inference_steps=32, generator=torch.Generator(device='cuda').manual_seed(1641421826),
323
- width=1368, height=720, guidance_scale=7.5, guidance_rescale=0.3).images[0]
324
- image.save(f'test/{shortname}_nobetas.png', format="PNG")
325
- ```
 
1
  ---
2
  license: creativeml-openrail-m
3
+ base_model: "stabilityai/stable-diffusion-2-1"
4
  tags:
5
+ - stable-diffusion
6
+ - stable-diffusion-diffusers
7
+ - text-to-image
8
+ - diffusers
9
+ - full
10
 
11
+ inference: true
12
 
13
+ ---
14
 
15
+ # pseudo-flex-v2
16
+
17
+ This is a full rank finetuned model derived from [stabilityai/stable-diffusion-2-1](https://huggingface.co/stabilityai/stable-diffusion-2-1).
18
+
19
+ The main validation prompt used during training was:
20
+
21
+ ```
22
+ a cinematic scene from the film Rogue One, a woman stares off into the distance, holding a sign that reads SOON
23
+ ```
24
+
25
+ ## Validation settings
26
+ - CFG: `9.2`
27
+ - CFG Rescale: `0.7`
28
+ - Steps: `30`
29
+ - Sampler: `euler`
30
+ - Seed: `420420420`
31
+ - Resolutions: `1024,1280x768,832x1216`
32
+
33
+ Note: The validation settings are not necessarily the same as the [training settings](#training-settings).
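For context, the CFG value above is applied as the standard classifier-free guidance combination of the unconditional and text-conditioned noise predictions. A minimal sketch over flat lists:

```python
def cfg_combine(uncond_pred, text_pred, guidance_scale=9.2):
    """Classifier-free guidance: push the prediction away from the
    unconditional output toward the text-conditioned one."""
    return [
        u + guidance_scale * (t - u)
        for u, t in zip(uncond_pred, text_pred)
    ]

cfg_combine([0.0, 1.0], [1.0, 1.0])  # → [9.2, 1.0]
```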
34
+
35
+
36
+
37
+
38
+ <Gallery />
39
+
40
+ The text encoder **was not** trained.
41
+ You may reuse the base model text encoder for inference.
42
+
43
+
44
+ ## Training settings
45
+
46
+ - Training epochs: 0
47
+ - Training steps: 1500
48
+ - Learning rate: 4e-07
49
+ - Effective batch size: 4
50
+ - Micro-batch size: 4
51
+ - Gradient accumulation steps: 1
52
+ - Prediction type: v_prediction
53
+ - Rescaled betas zero SNR: True
54
+ - Optimizer: AdamW, stochastic bf16
55
+ - Precision: Pure BF16
56
+ - Xformers: Enabled
57
+
58
+
59
+ ## Datasets
60
+
61
+ ### celebrities
62
+ - Repeats: 0
63
+ - Total number of images: 1216
64
+ - Total number of aspect buckets: 22
65
+ - Resolution: 1.0 megapixels
66
+ - Cropped: False
67
+ - Crop style: None
68
+ - Crop aspect: None
69
+ ### movieposters
70
+ - Repeats: 0
71
+ - Total number of images: 1684
72
+ - Total number of aspect buckets: 17
73
+ - Resolution: 1.0 megapixels
74
+ - Cropped: False
75
+ - Crop style: None
76
+ - Crop aspect: None
77
+ ### normalnudes
78
+ - Repeats: 0
79
+ - Total number of images: 1064
80
+ - Total number of aspect buckets: 26
81
+ - Resolution: 1.0 megapixels
82
+ - Cropped: False
83
+ - Crop style: None
84
+ - Crop aspect: None
85
+ ### propagandaposters
86
+ - Repeats: 0
87
+ - Total number of images: 616
88
+ - Total number of aspect buckets: 19
89
+ - Resolution: 1.0 megapixels
90
+ - Cropped: False
91
+ - Crop style: None
92
+ - Crop aspect: None
93
+ ### guys
94
+ - Repeats: 0
95
+ - Total number of images: 356
96
+ - Total number of aspect buckets: 17
97
+ - Resolution: 1.0 megapixels
98
+ - Cropped: False
99
+ - Crop style: None
100
+ - Crop aspect: None
101
+ ### pixel-art
102
+ - Repeats: 0
103
+ - Total number of images: 1308
104
+ - Total number of aspect buckets: 23
105
+ - Resolution: 0.5 megapixels
106
+ - Cropped: False
107
+ - Crop style: None
108
+ - Crop aspect: None
109
+ ### signs
110
+ - Repeats: 0
111
+ - Total number of images: 660
112
+ - Total number of aspect buckets: 16
113
+ - Resolution: 0.5 megapixels
114
+ - Cropped: False
115
+ - Crop style: None
116
+ - Crop aspect: None
117
+ ### moviecollection
118
+ - Repeats: 0
119
+ - Total number of images: 1856
120
+ - Total number of aspect buckets: 26
121
+ - Resolution: 1.0 megapixels
122
+ - Cropped: False
123
+ - Crop style: None
124
+ - Crop aspect: None
125
+ ### bookcovers
126
+ - Repeats: 0
127
+ - Total number of images: 776
128
+ - Total number of aspect buckets: 18
129
+ - Resolution: 1.0 megapixels
130
+ - Cropped: False
131
+ - Crop style: None
132
+ - Crop aspect: None
133
+ ### nijijourney
134
+ - Repeats: 0
135
+ - Total number of images: 616
136
+ - Total number of aspect buckets: 19
137
+ - Resolution: 1.0 megapixels
138
+ - Cropped: False
139
+ - Crop style: None
140
+ - Crop aspect: None
141
+ ### experimental
142
+ - Repeats: 0
143
+ - Total number of images: 3000
144
+ - Total number of aspect buckets: 17
145
+ - Resolution: 1.0 megapixels
146
+ - Cropped: False
147
+ - Crop style: None
148
+ - Crop aspect: None
149
+ ### ethnic
150
+ - Repeats: 0
151
+ - Total number of images: 3056
152
+ - Total number of aspect buckets: 23
153
+ - Resolution: 1.0 megapixels
154
+ - Cropped: False
155
+ - Crop style: None
156
+ - Crop aspect: None
157
+ ### sports
158
+ - Repeats: 0
159
+ - Total number of images: 772
160
+ - Total number of aspect buckets: 10
161
+ - Resolution: 1.0 megapixels
162
+ - Cropped: False
163
+ - Crop style: None
164
+ - Crop aspect: None
165
+ ### gay
166
+ - Repeats: 0
167
+ - Total number of images: 1064
168
+ - Total number of aspect buckets: 16
169
+ - Resolution: 1.0 megapixels
170
+ - Cropped: False
171
+ - Crop style: None
172
+ - Crop aspect: None
173
+ ### architecture
174
+ - Repeats: 0
175
+ - Total number of images: 4316
176
+ - Total number of aspect buckets: 22
177
+ - Resolution: 1.0 megapixels
178
+ - Cropped: False
179
+ - Crop style: None
180
+ - Crop aspect: None
181
+ ### shutterstock
182
+ - Repeats: 0
183
+ - Total number of images: 21008
184
+ - Total number of aspect buckets: 33
185
+ - Resolution: 1.0 megapixels
186
+ - Cropped: False
187
+ - Crop style: None
188
+ - Crop aspect: None
189
+ ### cinemamix-1mp
190
+ - Repeats: 0
191
+ - Total number of images: 9024
192
+ - Total number of aspect buckets: 6
193
+ - Resolution: 1.0 megapixels
194
+ - Cropped: False
195
+ - Crop style: None
196
+ - Crop aspect: None
197
+ ### nsfw-1024
198
+ - Repeats: 0
199
+ - Total number of images: 10792
200
+ - Total number of aspect buckets: 6
201
+ - Resolution: 1.0 megapixels
202
+ - Cropped: False
203
+ - Crop style: None
204
+ - Crop aspect: None
205
+ ### anatomy
206
+ - Repeats: 5
207
+ - Total number of images: 16376
208
+ - Total number of aspect buckets: 25
209
+ - Resolution: 1.0 megapixels
210
+ - Cropped: False
211
+ - Crop style: None
212
+ - Crop aspect: None
213
+ ### bg20k-1024
214
+ - Repeats: 0
215
+ - Total number of images: 89216
216
+ - Total number of aspect buckets: 42
217
+ - Resolution: 1.0 megapixels
218
+ - Cropped: False
219
+ - Crop style: None
220
+ - Crop aspect: None
221
+ ### yoga
222
+ - Repeats: 0
223
+ - Total number of images: 3576
224
+ - Total number of aspect buckets: 21
225
+ - Resolution: 1.0 megapixels
226
+ - Cropped: False
227
+ - Crop style: None
228
+ - Crop aspect: None
229
optimizer.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:d0e2d2ea0c77abd46362e23dcea4f54fc550ff68efe96a84314f9b5f918a6a4a
3
- size 6927867604
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:72ef647317a67d0355e407fec07791099391146408647913aef1db74cc72ff4d
3
+ size 5196082660
random_states_0.pkl CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:352a0b48037d2149964afd7f9885e50ac94d559f8dcce75f1f47aca9d31a243a
3
- size 14344
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d9be15cc983834b02dee0e1bb16d55722dee578e1233887a72fae2b9ec92c9a2
3
+ size 14408
scheduler.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:da71f151a50994e7de5cbae45eafb922ef01bd084c5a3c45f65f9dc023e4ce1d
3
  size 1000
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:903228551a0e74e8357a525e68a6ad53fe382087e1795fac447cb265eca01872
3
  size 1000
training_state-anatomy.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:e148b0c288a31437a99f227900ac4ad7a7eced86c3a440ad3748e49d7bf971a0
3
- size 2082031
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:25d6a769be31a6e2c9ee354b880346672a1ebb1d3e75bc3185ddf2a620bd6aa8
3
+ size 2224777
training_state-architecture.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ae0ebd149de13b0960bfab1b3289d81248acdb12b425fdf72fcbe7bb0be3f19d
3
+ size 226450
training_state-bg20k-1024.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:75a2d5a533d26ef161cbd624da330f8a0d8746567cf41554e2429c0660c8d3d1
3
+ size 14685678
training_state-bookcovers.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cd6033ecc2cf7619a1af4b01d151f75bf418d64de30beb3c30367fa216e1c142
3
+ size 29030
training_state-celebrities.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:06c370cce125a3c2779d86c1a80fd141407a23206934164d92c43a4c9e1eb296
3
+ size 34193
training_state-cinemamix-1mp.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:57757f3c983bd49a59369a585140f35d24388a107559b4b7a9de99194f0e20ae
3
- size 1055585
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:79de8a1266fb85bf90f2689a6c08545c8d2d928e6d5711738462df1dd5f686c5
3
+ size 1120860
training_state-ethnic.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5fa9be25106419d42c54d8c5fd450686c1ce13246a26e592b0ac24c0fe9bfedc
3
+ size 765021
training_state-experimental.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:086d57ff7fece50a46e686b590cb649aa8a5284000f7a06628da73718c3f2fd7
3
+ size 391587
training_state-gay.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:95aa82031bb92f730261593f0b449d1008c017f3c9fc5f3c50164d78b3ca410a
3
+ size 239727
training_state-guys.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:245b44e512d5613d9194b19677922322d00813b1397fd237d7369220c7593507
3
+ size 17231
training_state-moviecollection.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6a6ab824addc356560a7ec13f3fadf0ec0d57b6dfecefab8dfb2b103a1f322d6
3
+ size 49556
training_state-movieposters.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7d0d617267eee5c89e40b8ea6ec8f5c8e7d5011fbb03bd2654edf0492d01adbd
3
+ size 46296
training_state-nijijourney.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2d3768a8f95f4b18f9871d21be1357073115c1b6b285daabe480da3966f8d5f7
3
+ size 25982
training_state-normalnudes.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5e5445c71ad9ca4a94f77de1187725d101fefd130a1dfac2f634e1b7e47f65da
3
+ size 32344
training_state-nsfw-1024.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:d6976e3789a9eee0127e140e5970b0ac5be9628d04333ce715fe1fc1672054a2
3
- size 782526
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1c1524cd2630e38eb3b80fdea8f92ebb919dc779b031d9d9ffa90a08eee1424b
3
+ size 794425
training_state-pixel-art.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:20ea83a1f732bfe6e97ac5603b147125f4f73dbee90e7f3af55f0adb8d6c8eb6
3
+ size 37679
training_state-propagandaposters.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ce5902bddc1779dacee215e77e6d1cf41f4dd99140bc03dca5f8046124f06a67
3
+ size 22789
training_state-shutterstock.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:ba6652798531e37f909500ca9643eab46b8608d82801eebc62046d158e16b772
3
- size 3263336
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:52994806e5159a1603a56a2a4fed5056d45acdeeb962e34cd58f8f2a8f2e4fae
3
+ size 4490331
training_state-signs.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ab671380e48662461f6444fae601f8aeb9c56ac23b4bc7e98b6bd74ad85d0ba5
3
+ size 24097
training_state-sports.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:010b9de55c47e7ce80edc785f0e331e9d72b8a0e47e7dc09d4c0a9e2e72b2841
3
- size 117672
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ee41114fb761754f3919e2b25dedbc8df306a5c39c8e3ba04e92eed41ff2a92f
3
+ size 164769
training_state-yoga.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:98e2e545c6e8fc215e04d8815a0a06e8eeabe3e03c3d172ccd26b3b8e5d64968
3
- size 538833
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d209a16b91855e2891e11bea5cdce6ecbb3213f43000a8aaf7705656303b1c89
3
+ size 676113
training_state.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:5e394183cbd5c365fb4e0c51db3c8ff939f379c7df3d26eb4dc8a80f59212458
3
- size 310
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:266a962c7e54fccb92caeca56b77b7b80f4874ceec3c43ba4d6ef0f63e504d49
3
+ size 94
unet/config.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:3197c08397f77e403b54b8bc16707b7e06d0a9c1d88e13419c5abeaa982e22b8
3
- size 1856
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5155c6ff59d421a43da52a0c6ef57af9fd1d355597156acf7d901634bcf9547e
3
+ size 1877
unet/diffusion_pytorch_model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:9ce4cae7ca81dd47d43fe0370996297fb555ec183d04a68885117009f6565964
3
- size 3463726504
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e57f26e3279a7ad2b48c88b3c5e31068289071a5277a668944882d868e56fb63
3
+ size 1731905416