kompress-v13: GLM Agent Scenarios + Regex Teacher (experimental)

Training data was generated by GLM-5.1-FP8 simulating realistic multi-turn coding sessions across Rust, Python, TypeScript, and Go. Labels come from a deterministic must-keep regex (the same one used as the production safety net). This model is experimental; kompress-v8 remains the production recommendation.

Results

| Metric | v8 (domain data) | v13 (GLM scenarios) |
|---|---|---|
| heretic exact (32p) | 0.955 | 0.951 |
| keep_rate | 0.854 | 0.951 |
| override_delta | 0.000 | 0.000 |
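The trade-off in these numbers is easier to see as the share of tokens each model drops. The sketch below assumes that keep_rate is the fraction of tokens retained after compression; the card does not define the metric, so treat that reading, and the helper name `removed_fraction`, as assumptions.

```python
# Assumption: keep_rate is the fraction of tokens retained after compression.
RESULTS = {
    "v8": {"heretic": 0.955, "keep_rate": 0.854},
    "v13": {"heretic": 0.951, "keep_rate": 0.951},
}


def removed_fraction(keep_rate):
    """Share of tokens dropped, under the assumption stated above."""
    return 1.0 - keep_rate


for name, metrics in RESULTS.items():
    dropped = removed_fraction(metrics["keep_rate"])
    print(f"{name}: heretic={metrics['heretic']:.3f}, tokens removed={dropped:.1%}")
```

Under that reading, v8 removes roughly three times as many tokens as v13 at a nearly identical heretic score.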

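Finding: the GLM-generated scenarios are realistic, but the regex teacher labels too many tokens as must-keep, because paths, identifiers, and numbers all match in code. A model trained on those labels is conservative: keep_rate rises from 0.854 (v8) to 0.951 (v13), while heretic exact dips slightly from 0.955 to 0.951.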
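To make this failure mode concrete, here is a minimal sketch of a regex teacher. The patterns, the whitespace tokenizer, and the names `PATTERNS`, `label_tokens`, and `keep_rate` are illustrative assumptions, not the production safety net. The point is only that path, number, and identifier patterns fire far more often on tool output than on prose.

```python
import re

# Hypothetical must-keep patterns, assumed for illustration only.
PATTERNS = [
    re.compile(r"[\w.-]*/[\w./-]+"),                     # file paths
    re.compile(r"\d"),                                   # any token containing a digit
    re.compile(r"[A-Za-z]+_[A-Za-z_]+|[a-z]+[A-Z]\w*"),  # snake_case or camelCase
]


def label_tokens(text):
    """Return (token, must_keep) pairs using whitespace tokenization."""
    return [(tok, any(p.search(tok) for p in PATTERNS)) for tok in text.split()]


def keep_rate(labels):
    """Fraction of tokens labelled must-keep (0.0 for empty input)."""
    return sum(keep for _, keep in labels) / len(labels) if labels else 0.0


prose = "The build failed because one test did not pass"
tool_output = "tests/test_io.py:17 assert read_all(fd) == 4096 FAILED"

print(keep_rate(label_tokens(prose)))
print(keep_rate(label_tokens(tool_output)))
```

Every path, number, and identifier in the tool-output line is flagged, whereas nothing in the prose sentence is.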

Training

127 GLM-generated tool outputs plus 254 generic samples (33% GLM ratio), trained for 3 epochs starting from v2-base.
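The 33% figure follows directly from the two sample counts. A quick check, with variable names chosen here for illustration:

```python
glm_samples = 127      # GLM-generated tool outputs
generic_samples = 254  # generic samples

# Share of the training mix that comes from GLM scenarios.
glm_ratio = glm_samples / (glm_samples + generic_samples)
print(f"{glm_ratio:.1%}")  # 33.3%
```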

Usage

from headroom import compress, CompressConfig
result = compress(messages, config=CompressConfig(kompress_model="PeetPedro/kompress-v13"))

Conclusion

The regex teacher produces conservative models. v13 reaches a heretic score of 0.951, close to v8's 0.955, but its keep_rate is also 0.951 (v8: 0.854).

Use case

Demonstrates the limitation of the regex-as-teacher approach. The model's value is mainly educational.

Series

| Version | Teacher | Heretic | Status |
|---|---|---|---|
| v2-base | - | 0.975 | ceiling |
| v4 | self-labels | 0.943 | - |
| v8 | Qwen2.5-7B | 0.955 | production |
| v13 | regex | 0.951 | too conservative |

Full benchmark → | Training repo → | Headroom → | vaked.dev →
