Ugly Kontext — a FLUX.2 Klein 4B image-edit LoRA

A small image-edit LoRA trained on top of FLUX.2-klein-base-4B that transforms photos of cats, dogs, and small animal groups into deliberately crude "ugly sketch" versions. Trained on 120 paired (input photo, output sketch) examples using ai-toolkit.

Quick start

```python
import torch
from PIL import Image
from diffusers import Flux2KleinPipeline

pipe = Flux2KleinPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-klein-4B",
    torch_dtype=torch.bfloat16,
).to("cuda")

pipe.load_lora_weights(
    "stephenbtl/ugly-kontext-klein-4b-lora",
    weight_name="ugly_kontext_klein_4b_v1.safetensors",
    adapter_name="ugly",
)
pipe.set_adapters(["ugly"], adapter_weights=[1.0])

reference = Image.open("your_pet.jpg").convert("RGB").resize((1024, 1024))
result = pipe(
    prompt="change the photo the cat into an ugly sketch of the same cat",
    image=reference,
    height=1024, width=1024,
    num_inference_steps=4,
    guidance_scale=4.0,
    generator=torch.Generator("cuda").manual_seed(0),
).images[0]
result.save("ugly.jpg")
```

The LoRA was trained on the base model but is meant to run on the distilled FLUX.2-klein-4B (4 steps), as shown above. Applying it to the distilled model is faster and typically gives better results than applying it to the base model.

Klein expects width and height to be divisible by 16, with (W·H)/256 ≤ 4096, i.e. at most 1,048,576 pixels in total (the area of a 1024 × 1024 image).
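
The size rule can be checked before calling the pipeline. The helper below is a minimal sketch, not part of diffusers or this repo: `fit_size` is a hypothetical name, and shrinking uniformly and then flooring to a multiple of 16 is only one reasonable way to satisfy the constraint.

```python
import math

MULTIPLE = 16            # width and height must be divisible by this
MAX_PIXELS = 4096 * 256  # (W*H)/256 <= 4096  ->  W*H <= 1,048,576


def fit_size(width: int, height: int) -> tuple[int, int]:
    """Return a (width, height) that meets both constraints (hypothetical helper)."""
    # Shrink uniformly only when the pixel budget is exceeded.
    scale = min(1.0, math.sqrt(MAX_PIXELS / (width * height)))
    # Floor each side to a multiple of 16; flooring never increases the area.
    new_w = int(width * scale) // MULTIPLE * MULTIPLE
    new_h = int(height * scale) // MULTIPLE * MULTIPLE
    if new_w == 0 or new_h == 0:
        raise ValueError("image is too small or too elongated to fit the constraints")
    return new_w, new_h
```

The result can be used for both the `resize(...)` call on the reference image and the `height`/`width` arguments, so the two stay in agreement.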

Edit LoRAs in one paragraph

A text-to-image LoRA teaches the model an adjective, i.e. a style or concept, from (image, description) pairs and renders from the prompt alone. An image-edit LoRA teaches the model a verb, i.e. a transformation, from (input image, output image, edit instruction) triples; at inference you pass both an image and a prompt, and the model applies the trained transformation while preserving the input's identity. Klein supports both modes in the same pipeline: Flux2KleinPipeline runs text-to-image when you omit the image argument and image-edit when you pass it.
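
To make the mode switch concrete, here is a small sketch that assembles the call arguments. `build_call_kwargs` is a hypothetical helper, not a diffusers function, and the text-only prompt in the usage comment is purely illustrative; the only thing it relies on is that the presence of `image` selects the mode.

```python
def build_call_kwargs(prompt, image=None, size=1024, steps=4):
    """Assemble keyword arguments for a pipeline call (hypothetical helper)."""
    kwargs = {
        "prompt": prompt,
        "height": size,
        "width": size,
        "num_inference_steps": steps,
    }
    # Including `image` selects image-edit mode; leaving it out selects text-to-image.
    if image is not None:
        kwargs["image"] = image
    return kwargs


# Usage, assuming `pipe` and `reference` from the quick start:
# from_text = pipe(**build_call_kwargs("an ugly sketch of a cat")).images[0]
# from_photo = pipe(**build_call_kwargs(
#     "change the photo the cat into an ugly sketch of the same cat",
#     image=reference,
# )).images[0]
```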

Repo contents

Path	What it is
`ugly_kontext_klein_4b_v1.safetensors`	Final LoRA — step 2000.
`checkpoints/*.safetensors`	Intermediate checkpoints at 250-step intervals (250 → 1750). Useful for picking a sweet spot before over-training.
`samples/`	27 training-time samples — three prompts (cat / dog / animal group) at every checkpoint. Pure text-to-image (no control image), so they show style progression rather than edit behaviour.
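
The exact checkpoint filenames are not listed here, so one way to find them is to filter the repo's file listing. The sketch below assumes the layout in the table above; `checkpoint_files` is a hypothetical helper, and the adapter name `ugly_ckpt` in the usage comment is made up for illustration.

```python
from pathlib import PurePosixPath


def checkpoint_files(repo_files):
    """Return the .safetensors paths that sit directly under checkpoints/."""
    selected = []
    for name in repo_files:
        path = PurePosixPath(name)
        if path.parent == PurePosixPath("checkpoints") and path.suffix == ".safetensors":
            selected.append(name)
    # Plain string sort; this is not guaranteed to match training-step order.
    return sorted(selected)


# Usage (needs network access and the huggingface_hub package):
# from huggingface_hub import list_repo_files
# names = checkpoint_files(list_repo_files("stephenbtl/ugly-kontext-klein-4b-lora"))
# chosen = PurePosixPath(names[0]).name   # pick whichever checkpoint you want to try
# pipe.load_lora_weights(
#     "stephenbtl/ugly-kontext-klein-4b-lora",
#     subfolder="checkpoints",
#     weight_name=chosen,
#     adapter_name="ugly_ckpt",
# )
```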

Training details


Base model	`black-forest-labs/FLUX.2-klein-base-4B` (Apache 2.0)
Trainer	ostris/ai-toolkit, `arch: flux2_klein_4b`
Dataset	120 paired `(reference, target, caption)` examples — cats, dogs, small animal groups
Network	LoRA, `linear: 128`, `linear_alpha: 64`, `conv: 64`, `conv_alpha: 32`
Optimiser	`adamw8bit`, `lr: 1e-4`, `weight_decay: 1.5e-4`
Scheduler	flow-match, `timestep_type: shift`, `content_or_style: balanced`
Resolution	512² (matches dataset native resolution)
Steps	2000, `batch_size: 1`, gradient checkpointing
Hardware	RTX 4090 (24 GB), `quantize: true`
Wall time	~35 min
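
A few derived numbers help when reading the table. The arithmetic below uses only the values listed above; the one assumption is the common LoRA convention that the low-rank update is scaled by alpha / rank, which is not confirmed here for this trainer.

```python
# Values copied from the training details table.
rank, alpha = 128, 64            # linear / linear_alpha
conv_rank, conv_alpha = 64, 32   # conv / conv_alpha
steps, batch_size, dataset_size = 2000, 1, 120
wall_minutes = 35                # approximate

# Assumption: update scaled by alpha / rank (standard LoRA convention).
linear_scale = alpha / rank            # 0.5
conv_scale = conv_alpha / conv_rank    # 0.5

# How many times each training pair is seen on average, and the rough step time.
passes_over_dataset = steps * batch_size / dataset_size   # about 16.7
seconds_per_step = wall_minutes * 60 / steps              # about 1.05
```

Roughly 17 passes over 120 pairs is consistent with the advice to compare earlier checkpoints against the final one.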

Tips

Match the dataset's prompt shape. Captions follow the pattern "change the photo the X into an ugly sketch of the same X". Staying close to that wording activates the LoRA most reliably; very different phrasings work less well because the dataset has only 4 unique caption strings.
LoRA scale 0.7–1.0 is the sweet spot. Below 0.5 you keep more of Klein-base's polish; above 1.0 the transformation dominates and you start to amplify dataset quirks (extra eyes, etc.).
Try the step-1500 checkpoint before reaching for the final. Edit LoRAs over-train fast on small datasets, and earlier checkpoints often preserve subject identity better.
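
The first two tips can be combined into a small comparison loop. This is a sketch under stated assumptions: `make_prompt` and `sweep_lora_scale` are hypothetical helpers, the adapter is assumed to be loaded under the name "ugly" as in the quick start, and the scale values are simply points inside the range suggested above.

```python
CAPTION_TEMPLATE = "change the photo the {subject} into an ugly sketch of the same {subject}"


def make_prompt(subject: str) -> str:
    """Fill the training caption pattern with a subject such as 'cat' or 'dog'."""
    return CAPTION_TEMPLATE.format(subject=subject)


def sweep_lora_scale(pipe, image, subject, scales=(0.7, 0.85, 1.0),
                     make_generator=None, **call_kwargs):
    """Run the same edit at several LoRA scales; returns {scale: output image}."""
    results = {}
    for scale in scales:
        # "ugly" is the adapter name used in the quick start.
        pipe.set_adapters(["ugly"], adapter_weights=[scale])
        if make_generator is not None:
            # A fresh generator per run keeps the seed identical across scales.
            call_kwargs["generator"] = make_generator()
        output = pipe(prompt=make_prompt(subject), image=image, **call_kwargs)
        results[scale] = output.images[0]
    return results


# Usage, assuming `pipe`, `reference`, and `torch` from the quick start:
# outputs = sweep_lora_scale(
#     pipe, reference, "cat",
#     make_generator=lambda: torch.Generator("cuda").manual_seed(0),
#     height=1024, width=1024, num_inference_steps=4,
# )
# for scale, img in outputs.items():
#     img.save(f"ugly_scale_{scale}.jpg")
```

Holding the seed fixed means differences between outputs come from the LoRA scale rather than from the noise.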

Limitations

Caption monotony. Only 4 unique caption strings across 120 examples. The model learns the transformation well, but generalisation to differently-worded prompts is limited.
Subject distribution. Cats, dogs, and small animal groups dominate the training data. People, buildings, food, and other subjects work to varying degrees but show identity drift more often.
Aesthetic. The "ugly" look is intentional and quirky — it's not a polished sketch style.
Licence inheritance. The LoRA inherits Apache 2.0 from the base model.

Licence

Apache 2.0 (inherits from FLUX.2-klein-base-4B).

