Keeps emitting <tool_call> json into the context

#23
by theliphant - opened

Ever since updating to the merged 3.5/6 version, the model randomly stops and emits JSON for tool_call.
Using unsloth/Qwen3.6-27B-MTP-GGUF:Q5_K_XL in llama-server with q8 k/v cache.


 <tool_call>
 <function=bash>
 <parameter=command>
 find Library/PackageCache -name "UniversalResourceData.cs" 2>/dev/null
 </parameter>
 </function>
 </tool_call>

Although this is in pi so I wonder if something broke there 🙃

Looks like it believed it was a thinking block

{
  "type": "message",
  "id": "ccbda428",
  "parentId": "3318fa1e",
  "timestamp": "2026-05-18T16:32:49.150Z",
  "message": {
    "role": "assistant",
    "content": [
      {
        "type": "thinking",
        "thinking": "<tool_call>\n<function=bash>\n<parameter=command>\nfind Library/PackageCache -name \"UniversalResourceData.cs\" 2>/dev/null\n</parameter>\n</function>\n</tool_call>",
        "thinkingSignature": "reasoning_content"
      }
    ],
    "api": "openai-completions",
    "provider": "llama-cpp-mac",
    "model": "Qwen3.6-27B",
    "usage": {
      "input": 671,
      "output": 43,
      "cacheRead": 58402,
      "cacheWrite": 0,
      "totalTokens": 59116,
      "cost": {
        "input": 0,
        "output": 0,
        "cacheRead": 0,
        "cacheWrite": 0,
        "total": 0
      }
    },
    "stopReason": "stop",
    "timestamp": 1779121949860,
    "responseId": "chatcmpl-xVLlDQ1nqLjtHoIgiuaBwOcwrNYRto06"
  }
}
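For anyone who wants to check their own session logs for the same symptom, here is a minimal sketch that looks for tool-call markup that ended up inside a "thinking" part. It assumes the log entry has the shape shown above and that the markup follows the `<tool_call>` / `<function=...>` / `<parameter=...>` layout from the first post. The helper name `find_misfiled_tool_calls` is hypothetical and is not part of pi or llama.cpp.

```python
import json
import re

# Patterns for the XML-style tool-call markup shown in the post.
# Assumption: one <function=...> per <tool_call>, parameters are not nested.
TOOL_CALL_RE = re.compile(
    r"<tool_call>\s*<function=([^>\s]+)>(.*?)</function>\s*</tool_call>",
    re.DOTALL,
)
PARAM_RE = re.compile(r"<parameter=([^>\s]+)>\n?(.*?)\n?</parameter>", re.DOTALL)


def find_misfiled_tool_calls(entry: dict) -> list:
    """Return tool calls found inside 'thinking' content parts of a log entry."""
    calls = []
    for part in entry.get("message", {}).get("content", []):
        if part.get("type") != "thinking":
            continue  # only inspect reasoning blocks
        for name, body in TOOL_CALL_RE.findall(part.get("thinking", "")):
            arguments = dict(PARAM_RE.findall(body))
            calls.append({"name": name, "arguments": arguments})
    return calls


if __name__ == "__main__":
    # Trimmed copy of the entry from this thread.
    entry = {
        "message": {
            "role": "assistant",
            "content": [
                {
                    "type": "thinking",
                    "thinking": "<tool_call>\n<function=bash>\n<parameter=command>\n"
                    "find Library/PackageCache -name \"UniversalResourceData.cs\" "
                    "2>/dev/null\n</parameter>\n</function>\n</tool_call>",
                }
            ],
        }
    }
    print(json.dumps(find_misfiled_tool_calls(entry), indent=2))
```

This only detects the problem after the fact; it does not change how the server or the chat template splits reasoning from tool calls.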

Hi @theliphant, if you want, you can give my just-released template rebuild project a try.

https://huggingface.co/StableQuant/Qwen-Templates-Rebuild-Project/

If v1.0 doesn't fix it for you, please tell me. If it works, tell me too :-)

It will then be added to the research pipeline.

Regards
