FLUX.2-klein Character Transfer LoRA - Portable (r=128, foundation-trained)
A character-transfer LoRA for FLUX.2-klein that, unusually, drives both
the un-distilled foundation (FLUX.2-klein-base-9B) and the step-distilled
inference variant (FLUX.2-klein-9B). The same file gives a quality-first
path (28 inference steps on the base) and a speed-first path (4 inference
steps on the distilled), so you can iterate fast and ship slow with one set
of weights.
What it does
Given a stitched conditioning image (source 960×544 on the left, reference 544×544 on the right), the model regenerates the 960×544 source half with the reference character's identity transferred onto the source's pose, composition, and background. See the input-stitching section below for the exact layout.
Quick start (diffusers, foundation base, 28 steps, quality path)
import torch
from diffusers import Flux2KleinPipeline
pipe = Flux2KleinPipeline.from_pretrained(
"black-forest-labs/FLUX.2-klein-base-9B",
torch_dtype=torch.bfloat16,
).to("cuda")
pipe.load_lora_weights(
"24yearsold/flux2-klein-character-transfer-portable-r128",
weight_name="pytorch_lora_weights.safetensors",
)
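# build_stitched_conditioning is a user-supplied helper, not part of diffusers;
# it must return the layout described under "Input stitching".
cond_image = build_stitched_conditioning(source_path, reference_path) # 1504x544 RGB
with open("character_transfer_prompt.txt", encoding="utf-8") as f:
    prompt = f.read() # ~300 CJK chars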
out = pipe(
image=cond_image,
prompt=prompt,
num_inference_steps=28,
guidance_scale=3.5, # un-distilled foundation honours real CFG
).images[0]
out.save("transferred_quality.png")
Quick start (diffusers, distilled base, 4 steps, speed path)
# Reuses the imports, cond_image, and prompt from the quality-path example above.
pipe = Flux2KleinPipeline.from_pretrained(
"black-forest-labs/FLUX.2-klein-9B", # <-- distilled variant
torch_dtype=torch.bfloat16,
).to("cuda")
pipe.load_lora_weights(
"24yearsold/flux2-klein-character-transfer-portable-r128",
weight_name="pytorch_lora_weights.safetensors",
)
out = pipe(
image=cond_image,
prompt=prompt,
num_inference_steps=4, # <-- 4 steps matches the distilled recipe
guidance_scale=1.0,
).images[0]
out.save("transferred_fast.png")
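You can adjust LoRA strength on either path with
pipe.set_adapters("default", adapter_weights=0.85) (any value from 0.0 to 1.0 and above).
The adapter name must match the one registered at load time; if you did not pass adapter_name="default" to load_lora_weights, pipe.get_active_adapters() lists the actual name.
The LoRA was trained at α=1.0; 0.8-1.0 is the recommended range.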
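As a rough guide to what the strength value does, here is the usual LoRA update rule. It is a sketch that assumes the standard LoRA parameterisation, with rank 128 and alpha=128 taken from the training recipe. The symbols $B$, $A$, and $w$ (the adapter weight passed to set_adapters) are notation introduced here, not names from the training code.

$$
W_{\text{eff}} = W_{\text{base}} + w \cdot \frac{\alpha_{\text{LoRA}}}{r}\, B A,
\qquad
\frac{\alpha_{\text{LoRA}}}{r} = \frac{128}{128} = 1
$$

Under that assumption, w = 1.0 applies the low-rank delta at its trained magnitude, and w = 0.85 applies it at 85% of that magnitude.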
Input stitching
+--------------------+----------+
|   source 960x544   | ref 544² |  -> 1504 × 544 RGB
|  (center-cropped)  | (white-  |
|                    |  padded) |
+--------------------+----------+
The reference is padded to a square with white fill (RGBA references are composited over white first), then resized to 544×544. The source is center-cropped (or letterboxed) to 960×544. Concatenate the two left-to-right. The output is the regenerated 960×544 left half.
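The layout above can be sketched with Pillow as follows. This is a minimal illustration, not the author's preprocessing code: the helper names stitch_images and pad_reference_to_square are hypothetical, and the choice of Lanczos resampling and of center-cropping (rather than letterboxing) the source are assumptions.

```python
from PIL import Image, ImageOps

SRC_W, SRC_H = 960, 544  # source panel (left)
REF_SIDE = 544           # reference panel (right), square


def pad_reference_to_square(ref: Image.Image) -> Image.Image:
    """Composite over white, pad to a centered square, resize to 544x544."""
    rgba = ref.convert("RGBA")
    side = max(rgba.size)
    canvas = Image.new("RGBA", (side, side), (255, 255, 255, 255))
    offset = ((side - rgba.width) // 2, (side - rgba.height) // 2)
    canvas.alpha_composite(rgba, dest=offset)  # transparent pixels become white
    return canvas.convert("RGB").resize((REF_SIDE, REF_SIDE), Image.LANCZOS)


def stitch_images(source: Image.Image, reference: Image.Image) -> Image.Image:
    """Return the 1504x544 RGB conditioning image: source left, reference right."""
    # ImageOps.fit scales and center-crops to the exact target size (assumed policy).
    left = ImageOps.fit(source.convert("RGB"), (SRC_W, SRC_H), method=Image.LANCZOS)
    right = pad_reference_to_square(reference)
    stitched = Image.new("RGB", (SRC_W + REF_SIDE, SRC_H), (255, 255, 255))
    stitched.paste(left, (0, 0))
    stitched.paste(right, (SRC_W, 0))
    return stitched


def build_stitched_conditioning(source_path, reference_path) -> Image.Image:
    """Path-based wrapper matching the call used in the quick-start snippets."""
    with Image.open(source_path) as src, Image.open(reference_path) as ref:
        return stitch_images(src, ref)
```

If the pipeline returns the full stitched width rather than the source panel alone, the left half can be recovered with out.crop((0, 0, 960, 544)); whether that crop is needed depends on the pipeline's output size.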
Why two bases work with one LoRA
FLUX.2-klein-9B (distilled) and FLUX.2-klein-base-9B (foundation) share
the same transformer architecture (joint_attention_dim=12288, 8 dual-stream + 24 single-stream blocks,
attention_head_dim=128, same VAE, same Qwen3 text encoder). They differ only in the trained weight values: the distilled variant has step-and-guidance distillation baked in for 4-step inference; the foundation variant is the un-distilled weights.
This LoRA was fine-tuned on the foundation, which exposes the full gradient signal, and the resulting attention delta turns out to attach cleanly to either set of base weights. We verified this with a side-by-side eval across all checkpoints (see the W&B runs below).
Usage in ComfyUI
ComfyUI ≥ v0.10.0 (released 2026-01-20) loads this LoRA directly via the
standard Load LoRA node thanks to
Comfy-Org/ComfyUI #11981,
which added FLUX.2's to_qkv_mlp_proj and to_out key mapping to
comfy/utils.py:flux_to_diffusers. Drop the safetensors into
models/loras/ and use either the foundation or the distilled klein-9B as
your Load Checkpoint source. Recommended strength: 0.8-1.0.
Training recipe (P13)
| Setting | Value |
|---|---|
| Base | black-forest-labs/FLUX.2-klein-base-9B (un-distilled foundation) |
| Rank | 128 (alpha=128) |
| Init | gaussian (NOT PiSSA) |
| Target modules | to_k/q/v/out.0 + to_qkv_mlp_proj + single_transformer_blocks.{0..23}.attn.to_out |
| Trainable params | ~222 M |
| Schedule | cosine to 0.25× peak, 4 056 steps over 6 epochs |
| LR | 2 × 10⁻⁴ (warmup 50, num_cycles=0.3333) |
| Effective batch | 3 (3 GPUs × batch size 1, no gradient accumulation) |
| Precision | bf16 |
| Weighting scheme | none (uniform σ-sampling; NOT logit_normal) |
| Hardware | 3× RTX 4090 24 GB |
| Wall clock | ~5.5 h training + ~1.5 h eval/diagnose |
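The "cosine to 0.25× peak" entry follows from num_cycles=0.3333 if the schedule uses the standard cosine-with-warmup formula found in diffusers and transformers. That formula is an assumption here, as is reading "warmup 50" as 50 optimizer steps. At the final step the multiplier is 0.5·(1 + cos(2π/3)) = 0.25. A small standalone sketch using only the numbers from the table:

```python
import math

PEAK_LR = 2e-4       # from the recipe table
WARMUP_STEPS = 50    # assumed to be counted in optimizer steps
TOTAL_STEPS = 4056
NUM_CYCLES = 0.3333


def lr_at(step: int) -> float:
    """Learning rate at a given step under linear warmup + partial cosine decay."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    cosine = 0.5 * (1.0 + math.cos(math.pi * NUM_CYCLES * 2.0 * progress))
    return PEAK_LR * max(0.0, cosine)


if __name__ == "__main__":
    for step in (0, 25, 50, 2028, 4056):
        print(f"step {step:>4}: lr = {lr_at(step):.3e}")
```

Stopping a third of the way through the cosine cycle keeps the learning rate well above zero at the end of training.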
Held-out val_loss at ck-4056 (per-σ MSE, 8 held-out triplets)
| u | loss |
|---|---|
| 0.10 (texture) | 0.3252 |
| 0.25 | 0.2294 |
| 0.50 | 0.2484 |
| 0.75 | 0.3591 |
| 0.90 (struct) | 0.5480 |
| mean | 0.3420 |
The validation loss was still descending at the final checkpoint, so an 8-epoch follow-up is the natural next move if you want a slightly better LoRA.
Reference renders
samples/p13_lora_on_distilled_ablation.png: 8 triplets × 4 columns
(foundation-no-LoRA, foundation+LoRA (28 steps), distilled-no-LoRA (4 steps),
distilled+LoRA (4 steps)). The two no-LoRA columns confirm neither base does
character transfer on its own; the LoRA does all the work, and the
quality/speed gap between the two LoRA columns is modest.
samples/triplet_0..7_foundation_28steps.png: finals on the foundation base
samples/triplet_0..7_distilled_4steps.png: finals on the distilled base
Wandb (side-by-side, same LoRA, two bases)
- Foundation @ 28 steps, all 24 ckpts × 8 triplets: https://wandb.ai/team-rocket/flux-klein-character-transfer/runs/jj5qaxrn
- Distilled @ 4 steps, all 24 ckpts × 8 triplets: https://wandb.ai/team-rocket/flux-klein-character-transfer/runs/78ynegnw
- Direct 4-col ablation (the headline image): https://wandb.ai/team-rocket/flux-klein-character-transfer/runs/tql9zwhd
Related models from the same project
- 24yearsold/flux2-klein-9b-character-transfer-lora-r128 (P12): earlier LoRA trained against the distilled klein-9B. Lower MSE on the held-out triplets but tied to the distilled base; ships its own inference recipe.
- 24yearsold/flux2-klein-9b-character-transfer-pissa-r64 (P7): original PiSSA r=64 merged release.
This portable r=128 release supersedes both for new work.
License
Apache 2.0. The base models
(black-forest-labs/FLUX.2-klein-base-9B, black-forest-labs/FLUX.2-klein-9B)
carry their own licenses; follow those terms when using them.