Abruptly stopping with <tool call>

#13

by crotron - opened 23 days ago

I'm running Qwen3.6 27B with llama.cpp, pi mono, and the v8 chat template. It works really well most of the time, but occasionally it just stops, apparently while trying and failing to make a tool call. I'm honestly not sure what the cause could be. Maybe it's just a quirk of the model. Has anyone else seen this happen?

0x4tomic

22 days ago

See discussion #1 as linked. So far this definitely looks like a quirk of Qwen3.6, but the best speculation from the people using these modified templates is:

- Model quantization can affect how likely this is to happen.
- The template may tend to fail more in chained tool calls, e.g. `<tool_call> edit<x></tool_call>\nLet me also edit .\n...`
- If you are worried about cache invalidation and have `{%- set preserve_thinking = true %}` at the top of your template, the model's previous reasoning stays in context and can teach it the bad pattern of a "thinking" tool call.
- According to an AI review, the template teaches the model to always start its turn with `<think>`, even when it would not otherwise have been thinking.

These are our leading hypotheses, and none of them even take into account the harness or runtime you might be using. That is why some templates may work better or be more stable in different contexts.
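
To tell these cases apart in your own logs, something like the following rough sketch might help. It assumes you can capture the raw model output as a string and that tool calls are wrapped in `<tool_call>` / `</tool_call>` with reasoning in `<think>` / `</think>`. The function name `classify_tool_calls` and the returned keys are made up for illustration; this is not part of llama.cpp or any template.

```python
import re

OPEN_TAG = "<tool_call>"
CLOSE_TAG = "</tool_call>"


def classify_tool_calls(text: str) -> dict:
    """Summarize tool-call tags in raw model output (hypothetical helper)."""
    # Tool calls that have both an opening and a closing tag.
    complete = re.findall(r"<tool_call>(.*?)</tool_call>", text, flags=re.DOTALL)

    # More opening tags than closing tags means generation stopped mid-call.
    unclosed = max(text.count(OPEN_TAG) - text.count(CLOSE_TAG), 0)

    # Anything left after the last closing tag suggests a chained call,
    # i.e. the model kept talking after finishing a tool call.
    last_close = text.rfind(CLOSE_TAG)
    tail = text[last_close + len(CLOSE_TAG):] if last_close != -1 else ""
    chained = bool(complete) and tail.strip() != ""

    # A tool call that starts inside a <think> block (closed or not)
    # is the "thinking" tool call pattern.
    think_blocks = re.findall(r"<think>(.*?)(?:</think>|\Z)", text, flags=re.DOTALL)
    inside_think = any(OPEN_TAG in block for block in think_blocks)

    return {
        "complete": len(complete),
        "unclosed": unclosed,
        "chained": chained,
        "inside_think": inside_think,
    }


if __name__ == "__main__":
    sample = "<tool_call> edit<x></tool_call>\nLet me also edit .\n"
    print(classify_tool_calls(sample))
```

Running this over a few failed turns should at least show whether the stop happens in the middle of a call (`unclosed`), after a finished call (`chained`), or inside the reasoning block (`inside_think`), which narrows down which hypothesis applies to your setup.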

froggeric

Owner 21 days ago

Please try with v15. I think I have finally managed to fix the overthinking/indecision and the repeated loops on tool errors, among other things. So far it is holding up in my testing. I am hoping this will be the final version.

froggeric changed discussion status to closed 20 days ago
