Qwen3.5-9B (+ generation_config.json)
Verbatim copy of Qwen/Qwen3.5-9B with the only change being the addition of a
generation_config.json carrying both stop ids:
{"eos_token_id": [248046, 248044]}
Here 248046 is <|im_end|> and 248044 is <|endoftext|>.
Upstream Qwen/Qwen3.5-9B ships no generation_config.json, so inference engines
(sglang/vLLM) fall back to the eos_token_id in config.json, <|endoftext|> (248044), and never
stop on the chat turn terminator <|im_end|> (248046). The result is runaway generation
in multi-turn / tool-use rollouts. The 35B-A3B checkpoint ships this file; the 4B and 9B checkpoints do not.
This repo restores it so multi-turn generation halts correctly at each turn.
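As a rough illustration of the fix, the sketch below writes the same generation_config.json into a local model directory and checks which stop ids a loader would see. It uses only the Python standard library. The helper names write_generation_config and stop_ids are hypothetical, and the fallback in stop_ids (read generation_config.json if present, otherwise a top-level eos_token_id in config.json) is a simplified assumption that mirrors the behavior described above, not the actual sglang or vLLM loading code.

```python
import json
import tempfile
from pathlib import Path

# Token ids as stated in this model card.
IM_END_ID = 248046     # <|im_end|>
ENDOFTEXT_ID = 248044  # <|endoftext|>


def write_generation_config(model_dir):
    """Write generation_config.json carrying both stop ids; return its path."""
    path = Path(model_dir) / "generation_config.json"
    payload = {"eos_token_id": [IM_END_ID, ENDOFTEXT_ID]}
    path.write_text(json.dumps(payload, indent=2) + "\n")
    return path


def stop_ids(model_dir):
    """Return the stop ids a loader would see (hypothetical, simplified).

    Assumption: generation_config.json wins if it exists; otherwise the
    top-level eos_token_id of config.json is used.
    """
    model_dir = Path(model_dir)
    gen_path = model_dir / "generation_config.json"
    source = gen_path if gen_path.exists() else model_dir / "config.json"
    eos = json.loads(source.read_text()).get("eos_token_id")
    if eos is None:
        return set()
    return set(eos) if isinstance(eos, list) else {eos}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for the upstream layout: only config.json, single eos id.
        (Path(tmp) / "config.json").write_text(
            json.dumps({"eos_token_id": ENDOFTEXT_ID})
        )
        print("before:", sorted(stop_ids(tmp)))
        write_generation_config(tmp)
        print("after: ", sorted(stop_ids(tmp)))
```

Before the file is written, the only stop id in this stand-in directory is 248044; afterwards both 248044 and 248046 are present, which is the condition needed for generation to halt at <|im_end|>.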