davidrd123 committed on
Commit
a0bdcef
1 Parent(s): 2a82df6

Model card auto-generated by SimpleTuner

Files changed (1)
  1. README.md +259 -0
README.md ADDED
@@ -0,0 +1,259 @@
+ ---
+ license: other
+ base_model: "black-forest-labs/FLUX.1-dev"
+ tags:
+   - flux
+   - flux-diffusers
+   - text-to-image
+   - diffusers
+   - simpletuner
+   - safe-for-work
+   - lora
+   - template:sd-lora
+   - lycoris
+ inference: true
+ widget:
+ - text: 'unconditional (blank prompt)'
+   parameters:
+     negative_prompt: 'blurry, cropped, ugly'
+   output:
+     url: ./assets/image_0_0.png
+ - text: 'In the style of an Alfred Edmund Brehm illustration, Three stag beetles on oak bark, with one near green leaves at the top, another climbing vertically in the middle, and a third at the base amid fallen leaves and moss.'
+   parameters:
+     negative_prompt: 'blurry, cropped, ugly'
+   output:
+     url: ./assets/image_1_0.png
+ - text: 'In the style of an Alfred Edmund Brehm illustration, Four large moths around green leaves, one cream-colored, two brown with circular wing patterns, and one white moth in flight, with a pale caterpillar climbing on a leaf above.'
+   parameters:
+     negative_prompt: 'blurry, cropped, ugly'
+   output:
+     url: ./assets/image_2_0.png
+ - text: 'In the style of an Alfred Edmund Brehm illustration, A golden hamster sits upright on desert sand, its cheek pouches full of seeds. Three small scarab beetles move across the sand nearby, while a scorpion rests in the lower right corner.'
+   parameters:
+     negative_prompt: 'blurry, cropped, ugly'
+   output:
+     url: ./assets/image_3_0.png
+ - text: 'In the style of an Alfred Edmund Brehm illustration, A Range Rover in an African savanna setting, with two rhinoceros beetles on its front tire. Three dung beetles roll balls past its tracks in the dirt, while acacia trees stand in the background.'
+   parameters:
+     negative_prompt: 'blurry, cropped, ugly'
+   output:
+     url: ./assets/image_4_0.png
+ - text: 'In the style of an Alfred Edmund Brehm illustration, A glass Coca-Cola bottle lying sideways on brown leaves and soil. A line of black ants traverses its red label, two iridescent beetles explore the metal cap, and a pale moth rests on the glass neck.'
+   parameters:
+     negative_prompt: 'blurry, cropped, ugly'
+   output:
+     url: ./assets/image_5_0.png
+ - text: 'In the style of an Alfred Edmund Brehm illustration, Black over-ear headphones on a wooden table. Three small beetles crawl along the ear cushions, while a spider hangs between the headband adjusters, its web gleaming in the light.'
+   parameters:
+     negative_prompt: 'blurry, cropped, ugly'
+   output:
+     url: ./assets/image_6_0.png
+ - text: 'In the style of an Alfred Edmund Brehm illustration, A white athletic shoe on packed earth. Carpenter ants march through its eyelets, a beetle rests under the loosened tongue, and a cricket perches on the heel.'
+   parameters:
+     negative_prompt: 'blurry, cropped, ugly'
+   output:
+     url: ./assets/image_7_0.png
+ - text: 'In the style of an Alfred Edmund Brehm illustration, Three wooden pencils lying across a blank paper sheet. A praying mantis stands on one pencil tip, while two ladybugs explore graphite shavings scattered below.'
+   parameters:
+     negative_prompt: 'blurry, cropped, ugly'
+   output:
+     url: ./assets/image_8_0.png
+ ---
+
+ # AlbertBierstadt-Flux-LoKr
+
+ This is a LyCORIS adapter derived from [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev).
+
+ No validation prompt was used during training.
+
+ ## Validation settings
+ - CFG: `3.0`
+ - CFG Rescale: `0.0`
+ - Steps: `20`
+ - Sampler: `FlowMatchEulerDiscreteScheduler`
+ - Seed: `42`
+ - Resolution: `1024x1280`
+ - Skip-layer guidance:
+
+ Note: The validation settings are not necessarily the same as the [training settings](#training-settings).
+
+ You can find some example images in the following gallery:
+
+ <Gallery />
+
+ The text encoder **was not** trained.
+ You may reuse the base model text encoder for inference.
+
+ ## Training settings
+
+ - Training epochs: 0
+ - Training steps: 200
+ - Learning rate: 0.0004
+ - Learning rate schedule: polynomial
+ - Warmup steps: 200
+ - Max grad norm: 2.0
+ - Effective batch size: 3
+ - Micro-batch size: 3
+ - Gradient accumulation steps: 1
+ - Number of GPUs: 1
+ - Gradient checkpointing: True
+ - Prediction type: flow-matching (extra parameters=['shift=3', 'flux_guidance_mode=constant', 'flux_guidance_value=1.0', 'flow_matching_loss=compatible'])
+ - Optimizer: adamw_bf16
+ - Trainable parameter precision: Pure BF16
+ - Caption dropout probability: 10.0%
+ - SageAttention: Enabled for inference
+
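The batch figures in the training settings are related by a simple product: micro-batch size times gradient accumulation steps times number of GPUs. The sketch below only reproduces that arithmetic with the values reported in the list; the function name `effective_batch_size` is a hypothetical helper for illustration and is not part of SimpleTuner.

```python
def effective_batch_size(micro_batch: int, grad_accum_steps: int, num_gpus: int) -> int:
    """Number of samples that contribute to a single optimizer step."""
    return micro_batch * grad_accum_steps * num_gpus


# Values copied from the training settings list.
batch = effective_batch_size(micro_batch=3, grad_accum_steps=1, num_gpus=1)

# With 200 optimizer steps, this many samples were drawn in total.
samples_seen = batch * 200

print(batch, samples_seen)
```

Since the reported warmup (200 steps) equals the total number of training steps, the whole run appears to fall inside the warmup window.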
+ ### LyCORIS Config:
+ ```json
+ {
+     "algo": "lokr",
+     "multiplier": 1.0,
+     "linear_dim": 10000,
+     "linear_alpha": 1,
+     "factor": 16,
+     "apply_preset": {
+         "target_module": [
+             "Attention",
+             "FeedForward"
+         ],
+         "module_algo_map": {
+             "Attention": {
+                 "factor": 16
+             },
+             "FeedForward": {
+                 "factor": 8
+             }
+         }
+     }
+ }
+ ```
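To give a rough feel for what the `factor` values in this config mean, the sketch below counts the parameters of a Kronecker-product update for a single square linear layer. It rests on several assumptions that are not stated in the model card: the layer size of 3072 is only an illustrative value, `split_dim` is a hypothetical helper that picks the largest divisor not exceeding `factor` (LyCORIS uses its own factorization routine), and the large `linear_dim` of 10000 is assumed to leave the second Kronecker factor stored in full rather than low-rank decomposed.

```python
def split_dim(dim: int, factor: int) -> tuple[int, int]:
    """Hypothetical helper: split dim into (small, large) with small * large == dim.

    Picks the largest divisor of dim that does not exceed factor.
    """
    small = max(d for d in range(1, factor + 1) if dim % d == 0)
    return small, dim // small


def lokr_param_count(out_dim: int, in_dim: int, factor: int) -> int:
    """Parameters of an update dW = kron(W1, W2) with both factors stored in full."""
    o1, o2 = split_dim(out_dim, factor)
    i1, i2 = split_dim(in_dim, factor)
    # W1 has shape (o1, i1) and W2 has shape (o2, i2), so kron(W1, W2)
    # has shape (o1 * o2, i1 * i2) == (out_dim, in_dim).
    return o1 * i1 + o2 * i2


dense = 3072 * 3072  # illustrative layer size, an assumption
for factor in (16, 8):
    count = lokr_param_count(3072, 3072, factor)
    print(f"factor={factor}: {count} adapter params vs {dense} dense")
```

Under these assumptions a smaller `factor` gives a larger adapter, so the `FeedForward` modules (factor 8) would receive more trainable parameters per layer than the `Attention` modules (factor 16).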
+
+ ## Datasets
+
+ ### ab-512
+ - Repeats: 12
+ - Total number of images: 74
+ - Total number of aspect buckets: 4
+ - Resolution: 0.262144 megapixels
+ - Cropped: False
+ - Crop style: None
+ - Crop aspect: None
+ - Used for regularisation data: No
+ ### ab-768
+ - Repeats: 8
+ - Total number of images: 74
+ - Total number of aspect buckets: 9
+ - Resolution: 0.589824 megapixels
+ - Cropped: False
+ - Crop style: None
+ - Crop aspect: None
+ - Used for regularisation data: No
+ ### ab-1024
+ - Repeats: 5
+ - Total number of images: 74
+ - Total number of aspect buckets: 3
+ - Resolution: 1.048576 megapixels
+ - Cropped: False
+ - Crop style: None
+ - Crop aspect: None
+ - Used for regularisation data: No
+ ### ab-1536
+ - Repeats: 2
+ - Total number of images: 73
+ - Total number of aspect buckets: 16
+ - Resolution: 2.359296 megapixels
+ - Cropped: False
+ - Crop style: None
+ - Crop aspect: None
+ - Used for regularisation data: No
+ ### ab-crops-512
+ - Repeats: 8
+ - Total number of images: 74
+ - Total number of aspect buckets: 1
+ - Resolution: 0.262144 megapixels
+ - Cropped: True
+ - Crop style: random
+ - Crop aspect: square
+ - Used for regularisation data: No
+ ### ab-1024-crop
+ - Repeats: 6
+ - Total number of images: 74
+ - Total number of aspect buckets: 1
+ - Resolution: 1.048576 megapixels
+ - Cropped: True
+ - Crop style: random
+ - Crop aspect: square
+ - Used for regularisation data: No
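The megapixel figures in the dataset list correspond to familiar square resolutions, and the repeat counts determine how many samples one epoch would contain. The sketch below works through both using the values listed for each dataset. One point is an assumption rather than something the model card states: that `repeats` counts additional passes, so each image is seen `repeats + 1` times per epoch. The flag `extra_pass` switches that assumption off, and the helper names are hypothetical.

```python
import math

# (name, images, repeats, megapixels) copied from the dataset list.
DATASETS = [
    ("ab-512", 74, 12, 0.262144),
    ("ab-768", 74, 8, 0.589824),
    ("ab-1024", 74, 5, 1.048576),
    ("ab-1536", 73, 2, 2.359296),
    ("ab-crops-512", 74, 8, 0.262144),
    ("ab-1024-crop", 74, 6, 1.048576),
]


def square_edge(megapixels: float) -> int:
    """Edge length in pixels of a square image with the given pixel area."""
    return round(math.sqrt(megapixels * 1_000_000))


def samples_per_epoch(datasets, extra_pass: bool = True) -> int:
    """Total samples in one epoch across all datasets."""
    # ASSUMPTION: repeats counts extra passes, so each image appears repeats + 1 times.
    bonus = 1 if extra_pass else 0
    return sum(images * (repeats + bonus) for _, images, repeats, _ in datasets)


for name, _, _, megapixels in DATASETS:
    edge = square_edge(megapixels)
    print(f"{name}: {edge}x{edge} pixel area")

total = samples_per_epoch(DATASETS)
steps_per_epoch = math.ceil(total / 3)  # effective batch size of 3
print(total, steps_per_epoch)
```

Under either reading of `repeats`, one epoch takes more than 1000 optimizer steps at batch size 3, which is consistent with the reported 200 training steps and 0 completed epochs.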
+
+ ## Inference
+
+ ```python
+ import torch
+ from diffusers import DiffusionPipeline
+ from lycoris import create_lycoris_from_weights
+
+
+ def download_adapter(repo_id: str):
+     import os
+     from huggingface_hub import hf_hub_download
+     adapter_filename = "pytorch_lora_weights.safetensors"
+     cache_dir = os.environ.get('HF_PATH', os.path.expanduser('~/.cache/huggingface/hub/models'))
+     cleaned_adapter_path = repo_id.replace("/", "_").replace("\\", "_").replace(":", "_")
+     path_to_adapter = os.path.join(cache_dir, cleaned_adapter_path)
+     path_to_adapter_file = os.path.join(path_to_adapter, adapter_filename)
+     os.makedirs(path_to_adapter, exist_ok=True)
+     hf_hub_download(
+         repo_id=repo_id, filename=adapter_filename, local_dir=path_to_adapter
+     )
+
+     return path_to_adapter_file
+
+ model_id = 'black-forest-labs/FLUX.1-dev'
+ adapter_repo_id = 'davidrd123/AlbertBierstadt-Flux-LoKr'
+ adapter_filename = 'pytorch_lora_weights.safetensors'
+ adapter_file_path = download_adapter(repo_id=adapter_repo_id)
+ pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) # loading directly in bf16
+ lora_scale = 1.0
+ wrapper, _ = create_lycoris_from_weights(lora_scale, adapter_file_path, pipeline.transformer)
+ wrapper.merge_to()
+
+ prompt = "An astronaut is riding a horse through the jungles of Thailand."
+
+
+ ## Optional: quantise the model to save on vram.
+ ## Note: The model was quantised during training, and so it is recommended to do the same during inference time.
+ from optimum.quanto import quantize, freeze, qint8
+ quantize(pipeline.transformer, weights=qint8)
+ freeze(pipeline.transformer)
+
+ pipeline.to('cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu') # the pipeline is already in its target precision level
+ image = pipeline(
+     prompt=prompt,
+     num_inference_steps=20,
+     generator=torch.Generator(device='cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu').manual_seed(42),
+     width=1024,
+     height=1280,
+     guidance_scale=3.0,
+ ).images[0]
+ image.save("output.png", format="PNG")
+ ```
+
+ ## Exponential Moving Average (EMA)
+
+ SimpleTuner generates a safetensors variant of the EMA weights and a pt file.
+
+ The safetensors file is intended to be used for inference, and the pt file is for continuing finetuning.
+
+ The EMA model may provide a more well-rounded result, but typically will feel undertrained compared to the full model as it is a running decayed average of the model weights.
+