Ugly Kontext: a FLUX.2 Klein 4B image-edit LoRA

A small image-edit LoRA trained on top of FLUX.2-klein-base-4B that transforms photos of cats, dogs, and small animal groups into deliberately crude "ugly sketch" versions. Trained on 120 paired (input photo, output sketch) examples using ai-toolkit.

Quick start

import torch
from PIL import Image
from diffusers import Flux2KleinPipeline

# Load the distilled 4-step Klein model in bfloat16 on the GPU.
pipe = Flux2KleinPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-klein-4B",
    torch_dtype=torch.bfloat16,
).to("cuda")

# Attach the LoRA under the adapter name "ugly" and apply it at full strength.
pipe.load_lora_weights(
    "stephenbtl/ugly-kontext-klein-4b-lora",
    weight_name="ugly_kontext_klein_4b_v1.safetensors",
    adapter_name="ugly",
)
pipe.set_adapters(["ugly"], adapter_weights=[1.0])

# Resize the input to the output size (a plain resize does not keep the aspect ratio).
reference = Image.open("your_pet.jpg").convert("RGB").resize((1024, 1024))
# The prompt follows the wording of the training captions.
result = pipe(
    prompt="change the photo the cat into an ugly sketch of the same cat",
    image=reference,
    height=1024, width=1024,
    num_inference_steps=4,
    guidance_scale=4.0,
    generator=torch.Generator("cuda").manual_seed(0),  # fixed seed for repeatable output
).images[0]
result.save("ugly.jpg")

The LoRA is trained on the base model but runs on the distilled FLUX.2-klein-4B (4 steps), as shown above. Applying the LoRA to the distilled model is faster and typically gives better results than applying it to the base model.

Klein expects width and height divisible by 16, with (W·H)/256 ≤ 4096.
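
The size constraint above can be checked before calling the pipeline. The following is a minimal sketch, assuming you want to shrink (never enlarge) an arbitrary photo size while roughly keeping its aspect ratio. The helper names is_valid_size and fit_size are hypothetical and not part of diffusers.

```python
import math

MULTIPLE = 16       # width and height must be divisible by this
MAX_TOKENS = 4096   # upper bound on (W * H) / 256, as stated above


def is_valid_size(width, height):
    """Return True if (width, height) satisfies both stated constraints."""
    divisible = width % MULTIPLE == 0 and height % MULTIPLE == 0
    return divisible and width * height <= MAX_TOKENS * 256


def fit_size(width, height):
    """Shrink (never enlarge) a size to fit the limit, then round each
    side down to a multiple of 16. Hypothetical helper for illustration."""
    scale = min(1.0, math.sqrt(MAX_TOKENS * 256 / (width * height)))
    w = max(MULTIPLE, int(width * scale) // MULTIPLE * MULTIPLE)
    h = max(MULTIPLE, int(height * scale) // MULTIPLE * MULTIPLE)
    return w, h


print(fit_size(4000, 3000))   # a typical camera photo
print(fit_size(1024, 1024))   # already at the limit
```

The resulting pair can be passed both to resize() on the input image and as width= and height= in the pipeline call.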

Edit LoRAs in one paragraph

A text-to-image LoRA teaches the model an adjective (a style or concept) from (image, description) pairs and renders from prompt alone. An image-edit LoRA teaches the model a verb (a transformation) from (input image, output image, edit instruction) triples; at inference you pass both an image and a prompt, and the model applies the trained transformation while preserving the input's identity. Klein supports both modes in the same pipeline: Flux2KleinPipeline is text-to-image when you don't pass an image=, and image-edit when you do.
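
The mode switch described above comes down to whether image= is present in the call. The following is a minimal sketch, assuming the same call arguments as the quick start. build_call_kwargs is a hypothetical helper that only assembles keyword arguments; it does not run the model.

```python
def build_call_kwargs(prompt, image=None, size=1024, steps=4):
    """Assemble keyword arguments for a Flux2KleinPipeline call.

    Hypothetical helper: without an image the call is text-to-image,
    with an image it is an image edit.
    """
    kwargs = {
        "prompt": prompt,
        "height": size,
        "width": size,
        "num_inference_steps": steps,
    }
    if image is not None:
        kwargs["image"] = image  # presence of image= selects edit mode
    return kwargs


text_to_image = build_call_kwargs("an ugly sketch of a cat")
image_edit = build_call_kwargs(
    "change the photo the cat into an ugly sketch of the same cat",
    image="<PIL image goes here>",  # placeholder; pass a real PIL image
)
print(sorted(text_to_image))
print(sorted(image_edit))
```

With a loaded pipeline, either dictionary would be used as pipe(**kwargs).images[0].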

Repo contents

| Path | What it is |
| --- | --- |
| ugly_kontext_klein_4b_v1.safetensors | Final LoRA, step 2000. |
| checkpoints/*.safetensors | Intermediate checkpoints at 250-step intervals (250 → 1750). Useful for picking a sweet spot before over-training. |
| samples/ | 27 training-time samples: three prompts (cat / dog / animal group) at every checkpoint. Pure text-to-image (no control image), so they show style progression rather than edit behaviour. |
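
To pick an intermediate checkpoint, it helps to list what is under checkpoints/ sorted by step. The following is a minimal sketch that works on a plain list of repo paths. The file names in the example are hypothetical, and the assumption that the step count is the last run of digits in the file name is not confirmed by this card.

```python
import fnmatch
import re


def list_checkpoints(repo_files):
    """Return (step, path) pairs for files under checkpoints/, sorted by step.

    Assumes the step count is the last run of digits in the file name.
    """
    found = []
    for path in fnmatch.filter(repo_files, "checkpoints/*.safetensors"):
        digits = re.findall(r"\d+", path.rsplit("/", 1)[-1])
        if digits:
            found.append((int(digits[-1]), path))
    return sorted(found)


# Hypothetical file names, for illustration only.
example_files = [
    "ugly_kontext_klein_4b_v1.safetensors",
    "checkpoints/example_000001500.safetensors",
    "checkpoints/example_000000250.safetensors",
    "samples/example.jpg",
]
for step, path in list_checkpoints(example_files):
    print(step, path)
```

The real file list can be fetched with huggingface_hub.list_repo_files("stephenbtl/ugly-kontext-klein-4b-lora"), and the chosen path can then be tried as weight_name= in load_lora_weights.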

Training details

| Setting | Value |
| --- | --- |
| Base model | black-forest-labs/FLUX.2-klein-base-4B (Apache 2.0) |
| Trainer | ostris/ai-toolkit, arch: flux2_klein_4b |
| Dataset | 120 paired (reference, target, caption) examples: cats, dogs, small animal groups |
| Network | LoRA, linear: 128, linear_alpha: 64, conv: 64, conv_alpha: 32 |
| Optimiser | adamw8bit, lr: 1e-4, weight_decay: 1.5e-4 |
| Scheduler | flow-match, timestep_type: shift, content_or_style: balanced |
| Resolution | 512 × 512 (matches dataset native resolution) |
| Steps | 2000, batch_size: 1, gradient checkpointing |
| Hardware | RTX 4090 (24 GB), quantize: true |
| Wall time | ~35 min |
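
A few derived numbers make the settings above easier to compare with other runs. The following is a back-of-the-envelope sketch. The wall time is approximate, and the alpha/rank ratio assumes the common LoRA convention that updates are scaled by alpha divided by rank; whether ai-toolkit applies exactly that factor is an assumption here.

```python
steps, batch_size, dataset_size = 2000, 1, 120
wall_minutes = 35            # approximate, from the figure above
rank, alpha = 128, 64        # linear and linear_alpha

epochs = steps * batch_size / dataset_size     # passes over the dataset
seconds_per_step = wall_minutes * 60 / steps   # average step time
alpha_over_rank = alpha / rank                 # assumed LoRA scaling factor

print(f"{epochs:.1f} passes over the dataset")
print(f"{seconds_per_step:.2f} s per step")
print(f"alpha / rank = {alpha_over_rank}")
```

Each training pair is seen roughly 17 times, which is consistent with the advice to try earlier checkpoints if the final one looks over-trained.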

Tips

  • Match the dataset's prompt shape. Captions follow the pattern "change the photo the X into an ugly sketch of the same X". Sticking close to that wording produces the most reliable activation; very different phrasings work less well because the dataset has only 4 unique caption strings.
  • LoRA scale 0.7–1.0 is the sweet spot. Below 0.5 you keep more of Klein-base's polish; above 1.0 the transformation dominates and you start to amplify dataset quirks (extra eyes, etc.).
  • Try the step-1500 checkpoint before reaching for the final. Edit LoRAs over-train fast on small datasets, and earlier checkpoints often preserve subject identity better.
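
The first two tips can be combined into a small sweep: build the prompt from the training caption pattern and try a few LoRA scales in the recommended range. The following is a minimal sketch; make_prompt and the particular scale values are illustrative choices, and the template copies the caption wording quoted above verbatim.

```python
# Caption pattern quoted in the tips, with X as a placeholder.
TEMPLATE = "change the photo the {subject} into an ugly sketch of the same {subject}"

# Illustrative values inside the 0.7-1.0 range suggested above.
SCALES = [0.7, 0.85, 1.0]


def make_prompt(subject):
    """Fill the training caption pattern with a subject such as 'cat'."""
    return TEMPLATE.format(subject=subject.strip())


for scale in SCALES:
    # With a loaded pipeline, each scale would be applied via
    # pipe.set_adapters(["ugly"], adapter_weights=[scale]) before calling pipe(...).
    print(scale, make_prompt("dog"))
```

Fixing the seed across the sweep, as in the quick start, keeps the comparison between scales fair.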

Limitations

  • Caption monotony. Only 4 unique caption strings across 120 examples. The model learns the transformation well, but generalisation to differently-worded prompts is limited.
  • Subject distribution. Cats, dogs, and small animal groups dominate the training data. People, buildings, food, and other subjects work to varying degrees but show identity drift more often.
  • Aesthetic. The "ugly" look is intentional and quirky; it's not a polished sketch style.
  • Licence inheritance. The LoRA inherits Apache 2.0 from the base model.

Licence

Apache 2.0 (inherits from FLUX.2-klein-base-4B).
