kompress-v6

Token compression classifier fine-tuned from PeetPedro/kompress-v4 (ModernBERT-base, 149M params). Trained as part of the ultrawhale fine-tuning loop.

Kompress classifies each token in a message as keep (1) or drop (0). The headroom proxy uses it to compress LLM context before that context reaches the model.
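As a rough illustration of what the keep/drop labels mean downstream, the sketch below filters a token list by a label mask. The function name `apply_keep_mask` and the hand-written labels are hypothetical; they are not taken from headroom or from the model's actual output.

```python
from typing import List, Sequence


def apply_keep_mask(tokens: Sequence[str], labels: Sequence[int]) -> List[str]:
    """Return only the tokens whose label is 1 (keep); drop those labeled 0."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    return [tok for tok, lab in zip(tokens, labels) if lab == 1]


# Hand-written labels for illustration only (not real model predictions).
tokens = ["Please", "note", "that", "CVE-2021-44228", "affects", "log4j"]
labels = [0, 0, 0, 1, 1, 1]

compressed = apply_keep_mask(tokens, labels)
print(" ".join(compressed))
```

In practice the labels would come from the classifier, and the model operates on subword tokens rather than whitespace-split words.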

Eval results (heretic adversarial benchmark)

Heretic-style prompts elicit responses that are maximally dense with must-keep tokens (chemical formulas, CVE identifiers, memory addresses, line numbers). The benchmark measures what fraction of those tokens survive compression.

| Metric | Value |
|---|---|
| heretic exact_pct | 0.962 |
| keep_rate | 0.854 |
| override_delta | 0.000 |
| base model | kompress-v4 |
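The card does not spell out how these metrics are computed, so the following is only a sketch under one plausible reading: the survival score is the fraction of must-keep token occurrences that are labeled keep, and keep_rate is the fraction of all tokens labeled keep. The function names and the toy data are hypothetical.

```python
from typing import Sequence, Set


def must_keep_survival(tokens: Sequence[str], labels: Sequence[int],
                       must_keep: Set[str]) -> float:
    """Fraction of must-keep token occurrences that are labeled 1 (keep)."""
    flagged = [lab for tok, lab in zip(tokens, labels) if tok in must_keep]
    if not flagged:
        return 1.0  # nothing to protect, so nothing was lost
    return sum(flagged) / len(flagged)


def keep_rate(labels: Sequence[int]) -> float:
    """Fraction of all tokens that are labeled 1 (keep)."""
    return sum(labels) / len(labels) if labels else 0.0


# Toy example: the line number "42" is dropped, the address survives.
tokens = ["See", "line", "42", "and", "address", "0x7ffe12a0", "for", "details"]
labels = [0, 1, 0, 0, 0, 1, 0, 0]
must_keep = {"42", "0x7ffe12a0"}

survival = must_keep_survival(tokens, labels, must_keep)
rate = keep_rate(labels)
print(survival, rate)
```

Under this reading, a higher keep_rate makes it easier to score well on survival, which is why the two numbers are reported together.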

The full progression across all versions is listed under Series.

Training

The training set is 3,000 synthetic Claude Code agent-pattern pairs (bash_output, file_read, error_trace, search_result, json_tool_result) merged with 2,003 generic pairs. The model was fine-tuned from v4. Self-labeling was skipped because the v4 subword tokenizer degrades agent references (mk_in_ref collapsed to 0.652); generator references (mk_in_ref=1.0) were used directly instead. The resulting model is more conservative than v4, keeping more tokens (keep_rate 0.854 vs. 0.823).

Usage

# Via headroom proxy (recommended)
# ANTHROPIC_BASE_URL=http://localhost:8787 claude

# Direct library use
from headroom import compress, CompressConfig
result = compress(messages, config=CompressConfig(kompress_model="PeetPedro/kompress-v6"))

Series

| Version | heretic | keep_rate | Notes |
|---|---|---|---|
| v3 | 0.942 | 0.728 | first self-label |
| v3.1 | 0.925 | - | domain data |
| v3.2 | 0.929 | - | domain refined |
| v3.3 | 0.942 | - | domain-only, overfit |
| v4 | 0.967 | 0.823 | override internalized |
| v5 | 0.961 | - | loop converged |
| v6 | 0.962 | 0.854 | agent-distribution |

Training code: ultrawhale
