kompress-v8 — C3 Self-Distillation (production)

Token compression classifier trained with C3 self-distillation: a Qwen2.5-7B-Instruct teacher labels real agent tool outputs, and the classifier is fine-tuned from kompress-v2-base on those labels. This is the production recommendation and the best fine-tuned kompress model.

Used by headroom to compress LLM context. Trained in the ultrawhale loop and benchmarked on the heretic adversarial eval.

Results

Metric	v2-base	v4	v8
heretic exact (32p)	0.975	0.943	0.955
keep_rate	0.897	0.823	0.854
override_delta	—	0.000	0.000
agent mk_in_ref (with override)	—	0.962	1.000
compression	10%	18%	15%

v8 gives up 0.020 heretic exact (0.975 → 0.955) for roughly 50% more compression (10% → 15%) relative to v2-base. With the production must-keep override (headroom PR #1419), agent tool output survival is perfect (agent mk_in_ref 1.000).
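The trade-off can be recomputed from the Results table. The sketch below assumes that compression is 1 - keep_rate, which is not stated in this card but matches all three columns after rounding. The helper names `compression` and `tradeoff` are illustrative only.

```python
# Values copied from the Results table; nothing here is newly measured.
results = {
    "v2-base": {"heretic_exact": 0.975, "keep_rate": 0.897},
    "v4": {"heretic_exact": 0.943, "keep_rate": 0.823},
    "v8": {"heretic_exact": 0.955, "keep_rate": 0.854},
}


def compression(keep_rate: float) -> float:
    """Fraction of tokens dropped, ASSUMING compression = 1 - keep_rate."""
    return 1.0 - keep_rate


def tradeoff(baseline: str, candidate: str) -> dict:
    """Compare a candidate model against a baseline on both axes."""
    base, cand = results[baseline], results[candidate]
    base_c = compression(base["keep_rate"])
    cand_c = compression(cand["keep_rate"])
    return {
        # Absolute loss in heretic exact score.
        "heretic_drop": base["heretic_exact"] - cand["heretic_exact"],
        # Relative gain in compression, from unrounded keep rates.
        "relative_compression_gain": cand_c / base_c - 1.0,
    }


summary = tradeoff("v2-base", "v8")
print(f"heretic drop: {summary['heretic_drop']:.3f}")
print(f"relative compression gain: {summary['relative_compression_gain']:.0%}")
```

Under this assumption the unrounded keep rates give a relative gain of about 42%. The 50% figure follows from the rounded compression row (10% → 15%).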

Training

Training data: 97 Qwen2.5-7B-labeled pairs plus 200 generic multi-domain examples (33% C3 ratio). Fine-tuned for 3 epochs from v2-base on an RTX 4090; loss fell from 0.490 to 0.431. Key insight: Qwen teacher labels beat self-labels by +0.012 heretic (v8 at 0.955 vs v4 at 0.943).
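As a quick check on the numbers above, the data mix and loss change can be recomputed. This sketch assumes "C3 ratio" means the share of teacher-labeled C3 pairs in the combined training set; the variable names are illustrative.

```python
# Counts and losses copied from the Training paragraph.
c3_pairs = 97         # Qwen2.5-7B labeled pairs
generic_pairs = 200   # generic multi-domain examples

total = c3_pairs + generic_pairs
# ASSUMPTION: "C3 ratio" = C3 pairs / all training examples.
c3_ratio = c3_pairs / total

loss_start, loss_end = 0.490, 0.431
relative_loss_drop = (loss_start - loss_end) / loss_start

print(f"total examples: {total}")
print(f"C3 ratio: {c3_ratio:.1%}")
print(f"relative loss reduction: {relative_loss_drop:.1%}")
```

With this reading, 97 of 297 examples is about 32.7%, which rounds to the 33% quoted above.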

Usage

from headroom import compress, CompressConfig
result = compress(messages, config=CompressConfig(kompress_model="PeetPedro/kompress-v8"))

Or via env: HEADROOM_KOMPRESS_MODEL=PeetPedro/kompress-v8
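When the variable cannot be exported in the shell, it can also be set from Python. This is a minimal sketch: it assumes headroom reads HEADROOM_KOMPRESS_MODEL at the point where it resolves the kompress model, so the assignment should run before headroom is imported or called. That timing is an assumption, not something this card confirms.

```python
import os

# ASSUMPTION: headroom looks up this variable when it selects the kompress
# model, so set it before importing or calling headroom.
os.environ["HEADROOM_KOMPRESS_MODEL"] = "PeetPedro/kompress-v8"

# Read it back to confirm what the current process will expose.
selected_model = os.environ["HEADROOM_KOMPRESS_MODEL"]
print(selected_model)
```

The setting applies only to the current process and any child processes it starts.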

Complete Series

Version	Teacher	Data	Heretic	Keep	Status
v2-base	—	—	0.975	0.897	precision ceiling
v3	self-label	Q&A	0.942	0.728	first self-label
v4	self-label	domain	0.943	0.823	override internalized
v5	self-label	domain	0.961	—	converged
v6	generator	agent-dist	0.962	0.854	dead end
v7	sliding-window	agent	0.956	0.868	dead end
v8	Qwen2.5-7B	C3+generic	0.955	0.854	← use this
v9	Qwen2.5-7B	C3-only	0.921	—	overfit
v10	Qwen2.5-7B	scaled C3	0.947	0.891	diminishing
v11	Qwen2.5-7B	large enc	0.917	0.517	capacity ≠ precision
v12	Qwen3-Coder	C3+generic	0.949	0.949	too conservative
v13	regex	GLM scenarios	0.951	0.951	too conservative
v14	council	v8+GLM	0.882	—	proof-of-concept

All models: PeetPedro on Hugging Face

Conclusion

Production model. 0.955 heretic, 15% compression, 1.000 agent mk_in_ref with override. Pareto-optimal at λ=3.0.

Use Case

Production. Use for headroom proxy compression. Best balance of precision and compression.


Citation

If you use kompress-v8, please cite:

@software{lodri2026kompress,
  author = {Peter Lodri},
  title = {Asymmetric Loss Modulation Resolves the Voting Ensemble Paradox in Learned Context-Pruning Ensembles},
  url = {https://github.com/peterlodri-sec/longrun-eval-kompress},
  year = {2026},
  license = {Apache-2.0}
}