kompress_zh baseline v1 LoRA
Chinese plain-text compression for agent-grade context, not generic summarization.
kompress_zh is a deployment-oriented Chinese compression model line. It compresses long working text into denser, shorter text blocks while preserving meaning, execution anchors, and downstream readability.
What It Is
This release is built for:
- task instructions
- execution rules
- project docs
- sync notes
- result explanations
- long natural-language-heavy text containing paths, filenames, commands, URLs, and numbers
It is not built for:
- raw code
- raw JSON / YAML / XML
- logs or stack traces
- diffs or patches
- free-form summarization
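A caller may want to screen inputs before sending them to the adapter. The sketch below is one possible pre-filter, not something shipped with this release: the function name `looks_out_of_scope` and its regular expressions are hypothetical heuristics covering only a few of the unsupported input types listed above (raw JSON, diffs, stack traces).

```python
import json
import re
from typing import Optional

# Hypothetical heuristics; this release does not ship an input filter.
_DIFF_RE = re.compile(r"^(diff --git |@@ .* @@|\+\+\+ |--- )", re.MULTILINE)
_TRACE_RE = re.compile(
    r"^(Traceback \(most recent call last\):|\s+at \S+\(.*\))", re.MULTILINE
)


def looks_out_of_scope(text: str) -> Optional[str]:
    """Return a short reason if the text looks like unsupported input, else None."""
    stripped = text.strip()
    if not stripped:
        return "empty"
    # Raw JSON: the whole input parses as a JSON object or array.
    if stripped[0] in "{[":
        try:
            json.loads(stripped)
            return "raw JSON"
        except ValueError:
            pass
    if _DIFF_RE.search(text):
        return "diff or patch"
    if _TRACE_RE.search(text):
        return "stack trace"
    return None
```

Inputs that return `None` are not guaranteed to be in scope; the check only catches obvious cases and would need tuning against real traffic.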
Baseline Snapshot
- base model: Qwen/Qwen3.5-0.8B
- release type: LoRA adapter only
- training method: Swift + LoRA
- inference mode: language-model-only
- dataset: standardset_v6_1234
- checkpoint source: reference_v1 / checkpoint-61
- evaluation date: 2026-06-10
Headline numbers:
- 25.7% average reduction on the evaluated test split
- 92.2% strict anchor retention
- 99.1% anchor-bearing data in the baseline dataset
- evaluated on 132 test samples
Why It Matters
Most compression models can make text shorter. Fewer can keep the parts that matter in agent workflows:
- file paths
- commands and parameters
- model ids and numbers
- execution rules
- next-step hints and delivery constraints
kompress_zh is built around that narrower, harder problem.
Design Principles
- Compress Chinese plain text, not raw code or structured blobs.
- Allow light structuring and a light wenyan (classical Chinese) flavor in the compressed text.
- Treat anchors as high-value content, not noise.
- Trust case-level review over automatic scores alone.
The output target is compressed working text, not a generic summary artifact.
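To make "anchors as high-value content" concrete, here is a minimal sketch of how anchors could be extracted and checked for verbatim survival. It is an illustration under stated assumptions: the pattern set in `ANCHOR_RE` and the functions `extract_anchors` and `strict_anchor_retention` are hypothetical, and the release does not publish the exact anchor definition behind its reported retention numbers.

```python
import re
from typing import List

# Hypothetical anchor patterns. re.ASCII keeps \w and \b from matching
# Chinese characters, so anchors embedded in Chinese text stay separate.
ANCHOR_RE = re.compile(
    r"""
    https?://[\w./:?=&%~+-]+          # URLs
  | `[^`\n]+`                         # inline code spans such as `max_tokens`
  | (?:[\w.-]+/)+[\w.-]+              # slash-separated paths
  | [\w-]+\.[A-Za-z]{1,5}\b           # filenames with a short extension
  | (?<![\w.])\d+(?:\.\d+)?%?         # standalone numbers and percentages
    """,
    re.VERBOSE | re.ASCII,
)


def extract_anchors(text: str) -> List[str]:
    """Return unique anchors in order of first appearance, without backticks."""
    found = (m.group(0).strip("`") for m in ANCHOR_RE.finditer(text))
    return list(dict.fromkeys(found))


def strict_anchor_retention(source: str, output: str) -> float:
    """Fraction of source anchors that appear verbatim in the output."""
    anchors = extract_anchors(source)
    if not anchors:
        return 1.0  # nothing to retain
    kept = sum(1 for anchor in anchors if anchor in output)
    return kept / len(anchors)
```

Substring matching is deliberately simple and can over-count (for example, `192` would also match inside `1920`), so a check like this complements case-level review rather than replacing it.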
How To Use
This release assumes a base-model-plus-adapter setup:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base_model = "Qwen/Qwen3.5-0.8B"
adapter_path = "deserveall/kompress_zh-baseline-v1-lora"
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    trust_remote_code=True,
    device_map="auto",
    torch_dtype="auto",
)
model = PeftModel.from_pretrained(model, adapter_path)
model.eval()
source_text = """当前统一基线评测入口是 `scripts/eval_compare_full.py`。
未经统一确认,不要私自修改以下核心口径:
- `max_tokens`
- baseline / finetuned 对照方式
- `DeepSeek` 复核模式"""
prompt = f"""请将下面这段中文文本压缩改写为更短版本。要求:保留核心语义;尽量保留路径、命令、文件名、数字等关键锚点;允许轻结构化;允许轻文言压缩感;不要编造新信息。
<原文>
{source_text}
"""
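inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=192)
# Decode only the newly generated tokens so the prompt is not echoed back.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))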
Notes:
- This baseline is language-model-only.
- Do not use vision inputs.
- The model is designed for compression, not free-form creative rewriting.
Evaluation Context
Baseline v1 refers to the 2026-06-10 offline snapshot:
- train / val / test: 973 / 129 / 132
- test split size: 132
- avg prediction ratio: 0.7425
- avg reduction: 25.7%
- avg char F1: 0.8039
- avg strict anchor retention: 0.9216
- avg soft anchor retention: 0.8075
These numbers describe a strong first baseline for anchor-heavy Chinese agent text, not a final universal benchmark.
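For readers who want to compute comparable figures on their own data, the sketch below shows one plausible reading of the length and overlap metrics. The definitions are assumptions, since the release does not spell them out: prediction ratio is taken as output length over source length in characters, reduction as one minus that ratio, and char F1 as multiset character overlap between the prediction and a reference compression. Under this reading, the average reduction equals one minus the average ratio (1 - 0.7425 = 0.2575), which is consistent with the reported 25.7%.

```python
from collections import Counter


def prediction_ratio(source: str, prediction: str) -> float:
    """Output length divided by source length, measured in characters (assumed)."""
    if not source:
        raise ValueError("source must be non-empty")
    return len(prediction) / len(source)


def reduction(source: str, prediction: str) -> float:
    """Share of the source length removed by compression."""
    return 1.0 - prediction_ratio(source, prediction)


def char_f1(reference: str, prediction: str) -> float:
    """Character-level F1 from multiset overlap with a reference compression."""
    overlap = sum((Counter(reference) & Counter(prediction)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(prediction)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)
```

Character counts are used rather than token counts because the text is Chinese plain text; a token-based ratio would give different numbers.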
Release Scope
- this adapter is intended to be public
- the full training dataset is not promised as fully open
- public release focuses on task definition, evaluation framing, examples, and usable adapter weights
This conservative stance exists because the source pool mixes multiple real-world workflow-style inputs, and not every upstream source should be treated as fully redistributable.
Files In This Repo
- README.md
- adapter_config.json
- adapter_model.safetensors
- LICENSE
Limitations
- compression is still conservative on many samples
- link-heavy material is still harder than desired
- some cases may under-compress where a stronger model could go further
- this is a baseline release, not the final strongest version