v19 released with major improvements

#22

by froggeric - opened 3 days ago

I think I have finally solved the frequent stops in v19. So far it has been flawless in 3 long agentic tests in a row. Previously, I had it happen in around 80% of my sessions.

This has been a tough one to crack. To fix it I had to resort to better prompt engineering:

<IMPORTANT>
Reminder:
- You can use the <think></think> block to plan your next tool call OR to synthesize data and formulate your final response to the user.
- ALL explanation and reasoning MUST be placed strictly inside the <think></think> block.
- Function calls MUST follow the specified format: an inner <function=...></function> block must be nested within <tool_call></tool_call> XML tags.
- If you choose to call a tool, you MUST output the <tool_call> block IMMEDIATELY after closing </think>. Do NOT output any conversational text before the tool call.
- The <tool_call> and <function> tags MUST be at the very beginning of a new line, with NO spaces or indentation before them.
- To call multiple functions, output a separate, completely closed <tool_call></tool_call> block for EACH function. Do NOT nest <tool_call> blocks.
- If you have gathered all necessary data and do not need to call a tool, answer the question like normal and provide your final response to the user IMMEDIATELY after closing </think>.
</IMPORTANT>

It helped a bit, but did not solve it. What I think finally did it, was a complete rewrite of the KV cache handling, by setting preserve_thinking to true as default, and abolishing the empty think injection, which was poisoning the model's in-context learning.

xMASEx

2 days ago

Will give it a shot.

Thank you so much for your time and effort 🌹

slepkaviba

1 day ago

•

edited 1 day ago

Hey, I did check, and it works well, but issue I observed since v18, still persists with opencode and tool calls bleeding into the message.

Exploring the input and move control system now.
<function=aft_outline>
<parameter=target>
src/features/input
</parameter>
</function>
</tool_call>

Leaks into the prompt, and LLM stops...

froggeric

Owner 1 day ago

Strange, I have 0 instances of it across hundreds of tools calls since then. I am using F16, so maybe this is related to loss of intelligence in lower quants.

slepkaviba

1 day ago

•

edited 1 day ago

It happens ony with specific plugins..

Ones who modify messages (dcp, magic context).

And it happens with first message...

Without those plugins, it works amazing.

EDIT: Ok, this is for sure some context shananigans... As asking same question, with same prompt, just without context manipulation plugin - it works fine.

With plugins on - it bleeds either tools, thinking or both... And dies after first message, if second is a tool call...

I could provide you messages to compare - one with plugin enabled, and other without - if that is of any help?

froggeric

Owner 1 day ago

•

edited 1 day ago

In that case, I suspect those plugins manipulate the context history and KV cache, which is confusing the model on how to think, how to use tools, how to transition between states, etc. I would recommend not using any such plugin with any kind of model. It's akin to us, humans, have suddenly thoughts and memories disappearing throughout the day as we are working...

slepkaviba

1 day ago

Yea...

They manipulate messages a lot.

But that's "a" way to manage context

0x4tomic

1 day ago

I'd definitely pin that on OpenCode - I've had decent success with version 19. https://github.com/NousResearch/hermes-agent/issues/27339 is one such related issue (but for hermes-agent as opposed to OpenCode), so that's why I'm leaning towards the issue being the harness. Everyone is still learning quirks and "dos and don't"s.

I was using either v15 or v18 before with Hermes-Agent 27b (Opus 4.7 distilled) and it was losing its train of thought and getting into reasoning loops. I've been working through the kanban and it has been mostly reliable - no evidence of bad tool calls and it seems to behave well. Keep an eye on your harness's issue trackers!

slepkaviba

1 day ago

That might be. There are so many moving parts in this world..

mcr-ksh

about 18 hours ago

Claude Code is still unusable as of v19. Continuous break ups and stalls

slepkaviba

about 18 hours ago

I switched in opencode from openai to anthropic provider... And with magic context its stable. Still bleeds with dcp. .

froggeric

Owner about 16 hours ago

It works perfectly for me in Claude Code, using llama-server as an anthropic provider. I am using the Qwen 3.6 27b F16 gguf I published, with the chat template v19.

slepkaviba

about 16 hours ago

I think anthropic provider is important.

Became very stable (minus dcp injections).

mcr-ksh

about 15 hours ago

•

edited about 13 hours ago

im using vllm. I tried with and without on cyburn/Qwopus3.6-35B-A3B-v1-PrismaSCOUT-Blackwell-NVFP4-BF16-vllm-4.75bits with MTP and 256k context.
--default-chat-template-kwargs '{"enable_thinking": false, "auto_disable_thinking_with_tools": true, "max_tool_response_chars": 8192}'
While looking for a solution for my hangs, i've came accross settings default-chat-template-kwargs.

This is just an example of may hangs:

the comes up. However,

UPDATED: another random stop.

ManyOtherFunctions

about 14 hours ago

I'm actually getting a weird model stop in claude code on ls (running this in windows and powershell)
It happened both in native powershell and default ls)

and I'm not even sure if it fixes the thing where it reads contents of a PE / binary file and then permacrashes because its thinking has unicode tags.... still running that and hoping for the best.

froggeric

Owner about 14 hours ago

Please post issues in separate threads.

froggeric

Owner about 14 hours ago

I strongly recommend leaveing preserve_thinking to true (default in this template), and leaving thinking on (default as well). The models perform and reason a lot better.

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment