GLM-5.2-504B-K — knowledge-augmented REAP keep-168 (full-data Router-KD, NVFP4)

The "K-cut" sibling of 0xSero/GLM-5.2-504B: the same 504B / keep-168 budget, but the expert selection is biased toward knowledge & reasoning — the winning top-160 core (kept bit-for-bit) plus the 8 highest-priority knowledge-exclusive experts per layer that coding-saliency pruning drops. Recovered with gate-only **Router-KD trained on the FULL calibration set (18.6k real traces)** — 6x the data of the first-pass cuts.

Sponsor

8x NVIDIA B200 sponsored by Lambda. Thank you.

Why this variant exists

REAP saliency computed from coding traces under-weights experts that fire mainly on reasoning/knowledge. The K-cut deliberately re-includes them — trading a sliver of coding-saliency coverage for broader knowledge coverage. Reach for this on knowledge/reasoning-heavy workloads; use the plain GLM-5.2-504B otherwise.

Eval (n=2000 held-out real prompts, raw, no max_tokens / no timeout)

metric	GLM-5.2-504B-K (this)	GLM-5.2-504B (plain floor)
attractor / loop rate	0.078	0.072
natural-EOS rate	0.923	0.928
distinct-4	0.881	0.880
median tokens	1232	1267

Serving (vLLM)

vllm serve 0xSero/GLM-5.2-504B-K --tensor-parallel-size 8 \
  --quantization modelopt_fp4 --kv-cache-dtype fp8 --trust-remote-code --max-model-len 262144

REAP knowledge-augmented cut + full-data Router-KD. Compute sponsored by Lambda.

Honest note (n=2000)

The unpruned teacher loops on only 3.6% of these prompts vs ~7-8% for this pruned cut — REAP pruning roughly doubles the loop rate, and gate-only Router-KD (even on full data) does not close it. Earlier small-n evals suggesting parity were a sampling fluke. A knowledge-recovery LoRA is in progress to add capacity back.

Downloads last month: 126

Safetensors

Model size

292B params

Tensor type

BF16

F32

F8_E4M3

Model tree for 0xSero/GLM-5.2-504B-K

Base model

zai-org/GLM-5.2

Quantized

(80)

this model

Collection including 0xSero/GLM-5.2-504B-K

GLM — REAP

Collection

REAP-pruned & quantized GLM-4.x / 5 / 5.1 (+ Flash fine-tunes). • 25 items • Updated 5 days ago • 1