Krea 2 Turbo: Hand-Edited Weight Experiments
Overview
This repository contains weight-edited variants of the Krea 2 Turbo diffusion model. Each variant was created by surgically scaling specific transformer block weights in the 12.8B-parameter single-stream MMDiT, producing artistic and functional model variations without any retraining.
These are research artifacts from hand-editing diffusion model weights using the methodology described below. The base models (Krea 2 Turbo and Krea 2 Raw) are NOT included; only the edited variants are.
Method
All variants use the core formula:
theta_new = theta_original * (1 - 2 * alpha)
Where alpha controls the inversion strength:
- `alpha=0.05` → scale 0.90 (subtle)
- `alpha=0.10` → scale 0.80 (artistic sweet spot)
- `alpha=0.15` → scale 0.70 (strong)
- `alpha=0.20` → scale 0.60 (aggressive but functional)
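Full negation (alpha=1.0, scale=-1.0) breaks the model and is excluded from this repository. Note that under the formula above, alpha=0.5 gives scale 0.0, which zeroes the weights rather than negating them.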
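The scaling rule can be sketched in a few lines of Python. This is a minimal illustration, not the script used to produce these files: `scale_from_alpha` and `scale_tensors` are hypothetical helper names, and plain floats stand in for tensors (any value that supports multiplication by a float, such as a torch tensor, would behave the same way).

```python
def scale_from_alpha(alpha):
    """Return the multiplier (1 - 2 * alpha) from the core formula."""
    return 1.0 - 2.0 * alpha


def scale_tensors(state_dict, target_keys, alpha):
    """Return a copy of state_dict with the targeted entries scaled.

    Values only need to support `value * float`, so plain numbers work
    here as stand-ins for real tensors. Untargeted entries pass through.
    """
    scale = scale_from_alpha(alpha)
    targets = set(target_keys)
    return {
        key: (value * scale if key in targets else value)
        for key, value in state_dict.items()
    }


# Toy example: two fake "weights", only one of which is targeted.
weights = {"blocks.12.attn.wq": 2.0, "blocks.0.attn.wq": 2.0}
edited = scale_tensors(weights, ["blocks.12.attn.wq"], alpha=0.10)

# Print the alpha-to-scale mapping for the strengths used in this repository.
for alpha in (0.05, 0.10, 0.15, 0.20):
    print(f"alpha={alpha:.2f} -> scale={scale_from_alpha(alpha):.2f}")
```

Returning a new dictionary instead of editing in place keeps the original weights available for comparison, which is convenient when sweeping several alpha values over the same base.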
Architecture: Krea 2 Turbo
- Type: Single-stream MMDiT (Diffusion Transformer)
- Parameters: 12.8B
- File size: ~25GB per variant (BF16 + F32 tensors)
- Structure: 28 uniform transformer blocks
- Block sub-layers:
  - `blocks.N.attn.*` (7 tensors): gate, qknorm, wq, wk, wv, wo
  - `blocks.N.mlp.*` (3 tensors): gate, up, down (SwiGLU)
  - `blocks.N.mod.lin` (1 tensor): conditioning modulation
  - `blocks.N.prenorm.scale` / `blocks.N.postnorm.scale`
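Given this naming scheme, choosing which tensors to edit comes down to filtering names by block index and sub-layer prefix. The sketch below is illustrative only: `select_keys` is a hypothetical helper, and the exact per-block tensor names in `SUFFIXES` are assumptions (in particular, splitting qknorm into two tensors is a guess made to reach the stated count of 7 attention tensors).

```python
import re

BLOCK_RE = re.compile(r"^blocks\.(\d+)\.(.+)$")


def select_keys(keys, first_block, last_block, prefix=None):
    """Pick tensor names in blocks first_block..last_block (inclusive).

    prefix optionally restricts the match to one sub-layer, for example
    "attn" or "attn.gate"; None keeps every tensor in the block.
    """
    selected = []
    for key in keys:
        match = BLOCK_RE.match(key)
        if match is None:
            continue  # not a transformer-block tensor
        index, rest = int(match.group(1)), match.group(2)
        if not first_block <= index <= last_block:
            continue
        if prefix is None or rest == prefix or rest.startswith(prefix + "."):
            selected.append(key)
    return selected


# Hypothetical per-block tensor names: 7 attn + 3 mlp + 1 mod + 2 norm = 13.
SUFFIXES = [
    "attn.gate", "attn.qknorm.q", "attn.qknorm.k",
    "attn.wq", "attn.wk", "attn.wv", "attn.wo",
    "mlp.gate", "mlp.up", "mlp.down",
    "mod.lin", "prenorm.scale", "postnorm.scale",
]
all_keys = [f"blocks.{n}.{s}" for n in range(28) for s in SUFFIXES]

print(len(select_keys(all_keys, 12, 14)))              # all layers, mid blocks
print(len(select_keys(all_keys, 12, 14, "attn")))      # attention only
print(len(select_keys(all_keys, 0, 27, "attn.gate")))  # attention gate, all blocks
```

With these assumed names the three selections contain 39, 21, and 28 tensors, which matches the tensor counts listed for the B1, B3, and D variants.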
Variants
B1: Partial Inversion (Most Artistic)
| Property | Value |
|---|---|
| File | Krea_2_turbo_inv_B1_partial10.safetensors |
| Blocks | 12-14 (mid) |
| Layers | ALL (39 tensors per block group) |
| Alpha | 0.10 (scale=0.80) |
| Result | Most artistic variant: strong style/content shift while remaining coherent |
B3: Attention-Only Partial Inversion
| Property | Value |
|---|---|
| File | Krea_2_turbo_inv_B3_attn_p10.safetensors |
| Blocks | 12-14 (mid) |
| Layers | attn only (21 tensors) |
| Alpha | 0.10 (scale=0.80) |
| Result | Functional, subtler than B1 (attention-specific perturbation) |
D: Gate Scaling (All Blocks)
| Property | Value |
|---|---|
| File | Krea_2_turbo_inv_D_gate_p20.safetensors |
| Blocks | 0-27 (all) |
| Layers | attn.gate only (28 tensors) |
| Alpha | 0.20 (scale=0.60) |
| Result | Functional, moderate effect; gate weights are more tolerant of aggressive scaling |
F: Early/Late Block Inversion
| File | Blocks | Layers | Alpha | Result |
|---|---|---|---|---|
| Krea_2_turbo_F_early_a10.safetensors | 0-2 (early) | ALL | 0.10 (scale=0.80) | Affects structure, composition, spatial layout |
| Krea_2_turbo_F_late_a10.safetensors | 25-27 (late) | ALL | 0.10 (scale=0.80) | Affects style, color, detail, texture refinement |
G: Mid-Block Alpha Sweep
Three variants at different inversion strengths on the same block zone:
| File | Alpha | Scale | Notes |
|---|---|---|---|
| Krea_2_turbo_G_mid_a05.safetensors | 0.05 | 0.90 | Subtle |
| Krea_2_turbo_G_mid_a15.safetensors | 0.15 | 0.70 | Strong |
| Krea_2_turbo_G_mid_a20.safetensors | 0.20 | 0.60 | Aggressive but functional |
All target blocks 12-14, ALL layers.
H: Layer-Selective Mid-Block
| File | Blocks | Layers | Alpha |
|---|---|---|---|
| Krea_2_turbo_H_mid_attn_a10.safetensors | 12-14 | attn only | 0.10 |
| Krea_2_turbo_H_mid_mlp_a10.safetensors | 12-14 | mlp only | 0.10 |
Isolates the effect of attention vs MLP perturbation on the same block zone.
I: Gradient Alpha
| Property | Value |
|---|---|
| File | Krea_2_turbo_I_gradient.safetensors |
| Blocks | 0-27 (all) |
| Layers | ALL |
| Alpha | 0.03 → 0.17 (gradient across blocks) |
| Scale | 0.94 → 0.66 |
| Result | Smooth global perturbation: early blocks barely touched, late blocks aggressively inverted |
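A per-block alpha schedule of this kind might be generated as follows. The source gives only the endpoints (0.03 at block 0, 0.17 at block 27), so the linear spacing used here is an assumption, and `gradient_alphas` is a hypothetical helper name.

```python
def gradient_alphas(num_blocks=28, alpha_start=0.03, alpha_end=0.17):
    """Interpolate alpha from the first block to the last.

    Linear spacing is an assumption; only the endpoints are documented.
    """
    step = (alpha_end - alpha_start) / (num_blocks - 1)
    return [alpha_start + step * i for i in range(num_blocks)]


alphas = gradient_alphas()
# Apply the core formula to get one multiplier per block.
scales = [1.0 - 2.0 * a for a in alphas]

for block in (0, 13, 27):
    print(f"block {block:2d}: alpha={alphas[block]:.3f} scale={scales[block]:.3f}")
```

The first and last blocks reproduce the documented endpoints (scale 0.94 and 0.66); the intermediate values depend entirely on the interpolation assumed.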
Excluded Variants (Broken)
The following variants were created but are broken (model produces noise/garbage) and are NOT included:
| Variant | What was done | Why it broke |
|---|---|---|
| B2_attn_full | attn weights * -1.0 | Full negation destroys attention computation |
| D_wv_all | wv weights * -1.0 | Full negation of value projection |
| E_ties_mid | TIES-style sign flip on mid blocks | Full negation variant |
Usage
ComfyUI
- Place `.safetensors` files in `ComfyUI/models/diffusion_models/`
- Load via the `UNETLoader` node
- Use the same VAE, CLIP, and text encoder as Krea 2 Turbo
- Generate with your standard Krea 2 workflow
Diffusers
```python
import torch
from diffusers import DiffusionPipeline

# Load the pipeline in bfloat16 and move it to the GPU.
pipe = DiffusionPipeline.from_pretrained(
    "dataautogpt3/Krea2-weights-experiments",
    torch_dtype=torch.bfloat16,
    variant="bf16",
).to("cuda")
```
Note: These are diffusion model weights only. You need the corresponding VAE, text encoders, and tokenizer from the original Krea 2 Turbo release.
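Because each variant is a single large `.safetensors` file, it can be useful to check which tensors it contains without loading about 25GB into memory. The sketch below reads only the file header using the standard library, relying on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The helper names and the tiny demo file are hypothetical, and the dtype assigned to each demo tensor is invented for illustration.

```python
import json
import os
import struct
import tempfile


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict.

    The file starts with an 8-byte little-endian unsigned integer giving
    the header length, followed by that many bytes of UTF-8 JSON.
    """
    with open(path, "rb") as handle:
        (header_len,) = struct.unpack("<Q", handle.read(8))
        return json.loads(handle.read(header_len).decode("utf-8"))


def summarize(header):
    """Count tensors per dtype, skipping the optional __metadata__ entry."""
    counts = {}
    for name, info in header.items():
        if name == "__metadata__":
            continue
        counts[info["dtype"]] = counts.get(info["dtype"], 0) + 1
    return counts


def write_demo_file(path):
    """Write a tiny stand-in file so the sketch runs without a real variant."""
    header = {
        "blocks.0.attn.wq": {"dtype": "BF16", "shape": [2, 2], "data_offsets": [0, 8]},
        "blocks.0.prenorm.scale": {"dtype": "F32", "shape": [2], "data_offsets": [8, 16]},
    }
    blob = json.dumps(header).encode("utf-8")
    with open(path, "wb") as handle:
        # Header length, header JSON, then 16 zero bytes of tensor data.
        handle.write(struct.pack("<Q", len(blob)) + blob + bytes(16))


demo_path = os.path.join(tempfile.mkdtemp(), "demo.safetensors")
write_demo_file(demo_path)
print(summarize(read_safetensors_header(demo_path)))
```

To inspect a real variant, pass the path of a downloaded file (for example `Krea_2_turbo_inv_B1_partial10.safetensors`) to `read_safetensors_header` instead of the demo path.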
Key Findings
Scaling works, full negation breaks. Partial inversion (scale 0.60-0.90) produces functional, artistic variants. Full negation (scale=-1.0) breaks the model.
10% inversion is the sweet spot. Alpha=0.10 (scale=0.80) on mid blocks 12-14 produces the most artistically interesting results.
Mid blocks are safest to modify. Blocks 12-14 are the most redundant and tolerate perturbation best.
Gate weights are most tolerant. Attention gate weights can be scaled to 0.60 across all blocks while remaining functional; other layers break sooner.
The artistic effects come from compensation. Partial perturbation appears to trigger reorganization in the unedited blocks, which compensate for the scaled weights (the "compensatory masquerade" effect).
Research Context
This work draws on findings from:
- Task Arithmetic (Ilharco et al., ICLR 2023): formal basis for weight negation
- weights2weights (NeurIPS 2024): diffusion weight space as meta-latent
- Unraveling MMDiT Blocks (2025): per-block role mapping for MMDiT
- C3: Creative Concept Catalyst (CVPR 2025): low-frequency amplification in shallow blocks
- ConceptPrune (ICLR 2025): tiny weight changes shift semantic output
Credits
- Base model: Krea 2 Turbo (Krea AI)
- Weight editing: DataPlusEngine
- Methodology: Hand-editing diffusion weights via mmap-based surgical tensor scaling
