Issue with tool call generation
The chat template works correctly with the GGUF version of the model. However, when I use the same model (Q5_K_M) to test the tool-use prompt, the model does not behave correctly: the generated tool call does not include the function name. In addition, what does the prompt for tool use look like if there are two or more available tools? Could you please provide concrete examples for the tool-use case? Thanks a lot!
- The prompt string used is:
<extra_id_0>System
You are a helpful, respectful and honest assistant. Always answer as short as possible, while being safe.
<tool> {"type":"function","function":{"name":"get_current_weather","description":"Get the current weather in a given location","parameters":{"type":"object","properties":{"location":{"type":"string","description":"The city and state, e.g. San Francisco, CA"},"format":{"type":"string","description":"The temperature unit to use. Infer this from the users location.","enum":["celsius","fahrenheit"]}},"required":["location","format"]}}} </tool>
<extra_id_1>User
What is the weather like in Beijing now?
<extra_id_1>Assistant
- The generated tool call is:
<toolcall> {"type": "function", "arguments": {"location": "Beijing"}} </toolcall></s>
You don't need the "type" parameter in the tool description; pass the inner "function" object directly. Here's an example:
Input:
<extra_id_0>System
<tool> {"name": "get_current_weather", "description": "Get the current weather in a given location", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}, "format": {"type": "string", "description": "The temperature unit to use. Infer this from the users location.", "enum": ["celsius", "fahrenheit"]}}, "required": ["location", "format"]}} </tool>
<extra_id_1>User
What is the weather like in Beijing now?
<extra_id_1>Assistant\n
Output:
<toolcall> {"name": "get_current_weather", "arguments": {"location": "Beijing", "format": "celsius"}} </toolcall>
For multiple available tools, you just add a corresponding <tool> ... </tool> block for each tool. Example:
Input:
<extra_id_0>System
<tool> {"name": "get_current_weather", "description": "Get the current weather in a given location", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}, "format": {"type": "string", "description": "The temperature unit to use. Infer this from the users location.", "enum": ["celsius", "fahrenheit"]}}, "required": ["location", "format"]}} </tool>
<tool> {"name": "get_stock_price", "description": "Get the current stock price", "parameters": {"type": "object", "properties": {"symbol": {"type": "string", "description": "The stock symbol"}}, "required": ["symbol"]}} </tool>
<extra_id_1>User
What is the weather like in Beijing now and what's the stock price of NVDA?
<extra_id_1>Assistant\n
Output:
<toolcall> {"name": "get_current_weather", "arguments": {"location": "Beijing"}} </toolcall> <toolcall> {"name": "get_stock_price", "arguments": {"symbol": "NVDA"}} </toolcall>
This can also be done using HF's apply_chat_template utility:
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"
model_name = "nvidia/Nemotron-Mini-4B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Flat tool definitions (no "type"/"function" wrapper), as discussed above.
tools = [
    {"name": "get_current_weather", "description": "Get the current weather in a given location", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}, "format": {"type": "string", "description": "The temperature unit to use. Infer this from the users location.", "enum": ["celsius", "fahrenheit"]}}, "required": ["location", "format"]}},
    {"name": "get_stock_price", "description": "Get the current stock price", "parameters": {"type": "object", "properties": {"symbol": {"type": "string", "description": "The stock symbol"}}, "required": ["symbol"]}}
]
messages = [
    {"role": "user", "content": "What is the weather like in Beijing now and what's the stock price of NVDA?"}
]

# The chat template renders each tool as a <tool> ... </tool> block in the system turn.
tokenized_prompt = tokenizer.apply_chat_template(messages, tools=tools, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(device)
outputs = model.generate(tokenized_prompt, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
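From there you can dispatch the parsed calls to real Python functions. A sketch reusing parse_toolcalls from above; the function bodies are hypothetical stand-ins, not real APIs:

# Hypothetical stand-in implementations; swap in real weather/stock lookups.
def get_current_weather(location, format="celsius"):
    return {"location": location, "temperature": 22, "unit": format}

def get_stock_price(symbol):
    return {"symbol": symbol, "price": 123.45}

registry = {"get_current_weather": get_current_weather, "get_stock_price": get_stock_price}

generated = tokenizer.decode(outputs[0], skip_special_tokens=True)
for call in parse_toolcalls(generated):  # parse_toolcalls from the sketch above
    print(call["name"], "->", registry[call["name"]](**call["arguments"]))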
Thanks for your reply. It works when I follow your instructions, and the examples are also very helpful. I have another question about the <context> ... </context> part. Is it for RAG?
Yes, the <context> blocks can be used to provide documents for Q&A, similar to how tools are provided:
<extra_id_0>System
<context> Beyoncé Giselle Knowles-Carter (/biːˈjɒnseɪ/ bee-YON-say) (born September 4, 1981) is an American singer, songwriter, ... </context>
<context> Following the disbandment of Destiny's Child in June 2005, she released her second solo album, B'Day (2006), which contained hits "Déjà Vu", "Irreplaceable", ... </context>
<extra_id_1>User
When did Beyonce start becoming popular?
<extra_id_1>Assistant\n
And using apply_chat_template:
contexts = [
"Beyoncé Giselle Knowles-Carter (/biːˈjɒnseɪ/ bee-YON-say) (born September 4, 1981) is an American singer, songwriter, ...",
"Following the disbandment of Destiny's Child in June 2005, she released her second solo album, B'Day (2006), which contained hits \"Déjà Vu\", \"Irreplaceable\", ..."
]
messages = [
{"role": "user", "content": "When did Beyonce start becoming popular?"}
]
tokenized_prompt = tokenizer.apply_chat_template(messages, contexts=contexts, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(device)
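(As far as I know, apply_chat_template forwards unrecognized keyword arguments such as contexts to the template renderer, which is how they reach Nemotron's chat template.) Generation then proceeds exactly as in the tool example:

outputs = model.generate(tokenized_prompt, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))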
Both contexts and tools can also be passed together, using their respective formats.
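For example (a single call with both keyword arguments):

tokenized_prompt = tokenizer.apply_chat_template(messages, tools=tools, contexts=contexts, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(device)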
Got it. Thank you so much for the help!