Invalid tool-call format produced in multi-turn tool use (chat template)

#6
by alescire94 - opened

Problem

This concerns multi-turn tool calling, where a prior assistant tool call is part of the conversation history passed back to the template.

Symptom. When the messages passed to apply_chat_template include an assistant message with tool_calls, the template renders it as a single-quoted Python dict instead of JSON:

<tool_call>
{'id': 'call_1', 'type': 'function', 'function': {'name': 'search', 'arguments': '{"query": "ID card"}'}}
</tool_call>

This is not valid JSON, and it does not match the format chat_template.jinja requires on line 14: {"name": ..., "arguments": ...}.

Impact. Two consequences: (1) the <tool_call> body is no longer valid JSON, so any JSON parser applied to it fails; (2) the model is shown its own prior call in this malformed form and imitates it from round 2 on, emitting invalid tool calls itself.

Cause. chat_template.jinja line 14 instructs:

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call>

But chat_template.jinja line 97 (loop at lines 92–93) renders the tool call by stringifying the whole object:

{{- '<tool_call>\n' ~ tool_call ~ '\n</tool_call>' }}

Jinja's ~ operator converts each operand with Python str(), so the tool_call dict is emitted as its Python repr: single quotes plus the full id/type/function envelope.

(The same template is also embedded in the chat_template field of tokenizer_config.json.)
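
The str() behaviour can be checked without loading the model. This is a minimal sketch using only the standard library; is_valid_json is a hypothetical helper introduced for illustration, and str(tool_call) stands in for what the template's string concatenation does to the dict.

```python
import json

# Same assistant tool call as in the reproduction.
tool_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "search", "arguments": '{"query": "ID card"}'},
}

# Stand-in for the template's concatenation: the dict goes through str().
rendered = str(tool_call)
print(rendered)


def is_valid_json(text):
    """Hypothetical helper: True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(rendered))               # Python repr of the dict
print(is_valid_json(json.dumps(tool_call)))  # real JSON serialization
```

The repr keeps the whole id/type/function envelope and uses single quotes, so a JSON parser rejects it, while json.dumps of the same object parses cleanly.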

Reproduction

from transformers import AutoTokenizer
tok = AutoTokenizer.from_pretrained("engineering-group/EngGPT2-16B-A3B", trust_remote_code=True)

messages = [
    {"role": "user", "content": "How do I request an ID card?"},
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "search", "arguments": '{"query": "ID card"}'}}]},
]
print(tok.apply_chat_template(messages, add_generation_prompt=True, tokenize=False))

Actual:

<tool_call>
{'id': 'call_1', 'type': 'function', 'function': {'name': 'search', 'arguments': '{"query": "ID card"}'}}
</tool_call>

Expected (per chat_template.jinja line 14):

<tool_call>
{"name": "search", "arguments": {"query": "ID card"}}
</tool_call>

Fix (chat_template.jinja line 97)

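Build the JSON from the subfields instead of stringifying the object (this relies on function.arguments already being a JSON string, as in the reproduction; a dict would again be rendered with str()):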

{{- '<tool_call>\n{"name": "' ~ tool_call.function.name ~ '", "arguments": ' ~ tool_call.function.arguments ~ '}\n</tool_call>' }}

Verified: patching only chat_template.jinja line 97 makes the historical call render as valid JSON matching line 14.
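
For reference, the target format can also be expressed in plain Python. This is only a sketch of the intended output, not part of the template: render_tool_call is a hypothetical helper, and accepting arguments as either a JSON string or a dict is an assumption that goes slightly beyond the one-line patch.

```python
import json


def render_tool_call(tool_call):
    """Hypothetical helper: render one tool call as {"name": ..., "arguments": ...}."""
    function = tool_call["function"]
    arguments = function["arguments"]
    # OpenAI-style messages carry arguments as a JSON string; decode it so the
    # output embeds a JSON object rather than a quoted string.
    if isinstance(arguments, str):
        arguments = json.loads(arguments)
    body = json.dumps({"name": function["name"], "arguments": arguments})
    return "<tool_call>\n" + body + "\n</tool_call>"


call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "search", "arguments": '{"query": "ID card"}'},
}
print(render_tool_call(call))
```

Serializing through json.dumps also escapes quotes inside the function name and arguments, which plain string concatenation does not do.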


Tested on transformers==4.57.1, revision 6b8721551a706cc6a70d928fa467b6edbe989a8b.
