Qwen3-4B-Base-Thinking-Preservation

Qwen/Qwen3-4B-Base weights with the Qwen3-4B-Thinking-Preservation chat template (thinking always preserved, no nonthinking mode), so the base model can be SFT-trained / evaluated with the same multi-turn thinking contract.

Thinking is always preserved across multi-turn history (append-only). Every assistant turn keeps its <think>...</think> reasoning, not just the latest one, and the generation prompt always opens <think> (passing enable_thinking=False has no effect). This makes multi-turn agent training match evaluation — the model always sees its own prior reasoning. Model weights are identical to Qwen/Qwen3-4B-Base; only the chat template differs.

Downloads last month
11
Safetensors
Model size
4B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for eewer/Qwen3-4B-Base-Thinking-Preservation

Finetuned
(332)
this model

Collection including eewer/Qwen3-4B-Base-Thinking-Preservation