---
license: other
base_model: "stabilityai/stable-diffusion-3-medium-diffusers"
tags:
  - sd3
  - sd3-diffusers
  - text-to-image
  - diffusers
  - simpletuner
  - not-for-all-audiences
  - lora
  - template:sd-lora
  - standard
inference: true
---

# simpletuner-lora

This is a standard PEFT LoRA derived from [stabilityai/stable-diffusion-3-medium-diffusers](https://huggingface.co/stabilityai/stable-diffusion-3-medium-diffusers).

The main validation prompt used during training was:

```
A realistic food photo of @cutsom-dish
```

## Validation settings

- CFG: `3.0`
- CFG Rescale: `0.0`
- Steps: `20`
- Sampler: `None`
- Seed: `42`
- Resolution: `1024x1024`

Note: The validation settings are not necessarily the same as the [training settings](#training-settings).

<Gallery />

The text encoder **was not** trained.
You may reuse the base model text encoder for inference.
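
As a rough illustration of how the validation settings correspond to a diffusers call, the sketch below converts them into keyword arguments. The `validation_settings` dict and the `to_pipeline_kwargs` helper are hypothetical names introduced here for illustration; they are not part of SimpleTuner or diffusers. The keyword names are the ones used in the inference example in this card.

```python
# Hypothetical helper (not part of SimpleTuner or diffusers): maps the
# validation settings from this card onto pipeline keyword arguments.
validation_settings = {
    "cfg": 3.0,
    "steps": 20,
    "seed": 42,
    "resolution": "1024x1024",
}


def to_pipeline_kwargs(settings):
    # "1024x1024" -> width=1024, height=1024
    width, height = (int(v) for v in settings["resolution"].split("x"))
    return {
        "guidance_scale": settings["cfg"],
        "num_inference_steps": settings["steps"],
        "width": width,
        "height": height,
    }


kwargs = to_pipeline_kwargs(validation_settings)
print(kwargs)
```

The seed is left out of the returned dict because the pipeline takes it through a `torch.Generator` passed as `generator`, rather than as a plain integer.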

## Training settings

- Training epochs: 14
- Training steps: 30
- Learning rate: 0.0001
- Effective batch size: 1
- Micro-batch size: 1
- Gradient accumulation steps: 1
- Number of GPUs: 1
- Prediction type: flow-matching
- Rescaled betas zero SNR: False
- Optimizer: adamw_bf16
- Precision: Pure BF16
- Quantised: No
- Xformers: Not used
- LoRA Rank: 16
- LoRA Alpha: None
- LoRA Dropout: 0.1
- LoRA initialisation style: default
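
The effective batch size listed above is conventionally the product of the micro-batch size, the gradient accumulation steps, and the number of GPUs. The arithmetic below is a minimal sketch under that assumption, using the values from this card; the variable names are illustrative only.

```python
# Values taken from the training settings in this card.
micro_batch_size = 1
gradient_accumulation_steps = 1
num_gpus = 1
training_steps = 30

# Assumed convention: samples contributing to one optimizer step.
effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus

# Total samples processed over the whole run.
samples_seen = training_steps * effective_batch_size

print(effective_batch_size, samples_seen)
```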

## Datasets

### food_images

- Repeats: 0
- Total number of images: 2
- Total number of aspect buckets: 1
- Resolution: 1.0 megapixels
- Cropped: True
- Crop style: center
- Crop aspect: square
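
To make the "center" crop style with a "square" aspect concrete, the sketch below computes the crop box for the largest centered square in an image. This is an illustration of the general technique, not SimpleTuner's actual cropping code; `center_square_crop_box` is a hypothetical helper name. The returned tuple uses the `(left, upper, right, lower)` order that Pillow's `Image.crop` accepts.

```python
def center_square_crop_box(width, height):
    """Return (left, upper, right, lower) for the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


# A 1600x1200 landscape image loses 200 px from each side.
print(center_square_crop_box(1600, 1200))  # (200, 0, 1400, 1200)
```

After cropping, the square would still need to be resized to the training resolution, which this sketch does not cover.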

## Inference

```python
import torch
from diffusers import DiffusionPipeline

model_id = 'stabilityai/stable-diffusion-3-medium-diffusers'
adapter_id = 'baldesco/simpletuner-lora'

# Load the base pipeline, then attach the LoRA adapter weights.
pipeline = DiffusionPipeline.from_pretrained(model_id)
pipeline.load_lora_weights(adapter_id)

prompt = "A realistic food photo of @cutsom-dish"
negative_prompt = 'blurry, cropped, ugly'

# Prefer CUDA, then Apple MPS, then fall back to CPU.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device)

image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=20,
    generator=torch.Generator(device=device).manual_seed(1641421826),
    width=1024,
    height=1024,
    guidance_scale=3.0,
).images[0]
image.save("output.png", format="PNG")
```