Fix tool calling model in unit2/tool_calling_agents

#135

by ananyaem - opened 4 days ago

base: refs/heads/main ← from: refs/pr/135

ananyaem

4 days ago

No description provided.

Fix tool calling model to Qwen2.5-Coder-32B-Instruct (920152d6)

ananyaem

4 days ago

Summary

Fixes ToolCallingAgent failing in tool_calling_agents.ipynb with a 400 Bad Request from the Hugging Face Inference API.

smolagents 1.25.0 changed the default model for InferenceClientModel to Qwen/Qwen3-Next-80B-A3B-Thinking. That model does not support tool calling on the router endpoint, so the first agent step fails as soon as tools are sent with the request. The notebook now explicitly uses Qwen/Qwen2.5-Coder-32B-Instruct, which matches the other Unit 2 smolagents notebooks and works with ToolCallingAgent.

Changes

unit2/smolagents/tool_calling_agents.ipynb: pass model_id="Qwen/Qwen2.5-Coder-32B-Instruct" to InferenceClientModel instead of relying on the library default.
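For reference, the change amounts to something like the sketch below. It is not copied from the notebook: the choice of WebSearchTool and the build_agent helper name are assumptions made for illustration, and the import is deferred so the snippet loads even where smolagents is not installed.

```python
# Pin the model explicitly instead of relying on the library default,
# which changed in smolagents 1.25.0 according to this PR.
MODEL_ID = "Qwen/Qwen2.5-Coder-32B-Instruct"


def build_agent():
    """Build a ToolCallingAgent with a pinned model (illustrative helper)."""
    # Imported lazily so this snippet can be loaded without smolagents installed.
    from smolagents import InferenceClientModel, ToolCallingAgent, WebSearchTool

    model = InferenceClientModel(model_id=MODEL_ID)
    # WebSearchTool is an assumption here; the notebook may use another search tool.
    return ToolCallingAgent(tools=[WebSearchTool()], model=model)
```

Calling build_agent() requires smolagents and a logged-in Hugging Face token; the returned agent is then used with agent.run(...) as in the existing notebook cell.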

Test plan

Run the HF login cell in tool_calling_agents.ipynb
Run the ToolCallingAgent example cell; confirm it completes without AgentGenerationError / 400
Confirm the agent banner shows InferenceClientModel - Qwen/Qwen2.5-Coder-32B-Instruct
Confirm a tool call (e.g. web_search) appears in the trace output

ananyaem changed pull request status to open 4 days ago

ananyaem changed pull request title from Fix tool calling model to Fix tool calling model unit2/tool_calling_agent 4 days ago

ananyaem changed pull request title from Fix tool calling model unit2/tool_calling_agent to Fix tool calling model in unit2/tool_calling_agents 4 days ago

agentgraph-official

1 day ago

Good catch on the tool calling fix. The tool calling agent examples in unit2 of the agents-course/notebooks repo are particularly sensitive to model selection because not all models expose a consistent function-calling interface. Some handle the JSON schema for tool definitions differently. Others silently fall back to text generation without actually invoking the tool, a subtle failure mode that is hard to catch without inspecting the output carefully.

The core issue with tool calling agents is that the model needs to reliably emit structured outputs that conform to the tool schema, and smaller or older checkpoints on the Hub often behave inconsistently here. If the fix involves swapping to a model with better function-calling support (something like a recent Mistral or Qwen variant with explicit tool-use fine-tuning), it's worth documenting in the notebook which capability specifically broke and why the replacement handles it correctly. That context is genuinely useful for learners trying to understand the boundary between model capability and agent scaffolding.
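To make the "conforms to the tool schema" check concrete, here is a minimal sketch using only the standard library. It assumes an OpenAI-style assistant message dict with a "tool_calls" list and JSON-encoded "arguments"; that message shape, the classify_response name, and the simplified tool spec format are all assumptions for illustration, not part of smolagents.

```python
import json


def classify_response(message, tools):
    """Classify one assistant message as 'tool_call', 'malformed', or 'text_fallback'.

    `tools` maps a tool name to a simplified spec with "properties" and "required".
    """
    calls = message.get("tool_calls") or []
    if not calls:
        # The model answered in plain text instead of invoking a tool.
        return "text_fallback"
    for call in calls:
        fn = call.get("function", {})
        spec = tools.get(fn.get("name"))
        if spec is None:
            return "malformed"  # unknown or missing tool name
        try:
            args = json.loads(fn.get("arguments", ""))
        except (TypeError, json.JSONDecodeError):
            return "malformed"  # arguments are not valid JSON
        if not isinstance(args, dict):
            return "malformed"
        missing = set(spec["required"]) - set(args)
        unknown = set(args) - set(spec["properties"])
        if missing or unknown:
            return "malformed"
    return "tool_call"


# Hypothetical tool spec and messages, used only to exercise the function.
TOOLS = {"web_search": {"properties": {"query": {"type": "string"}}, "required": ["query"]}}
GOOD = {"tool_calls": [{"function": {"name": "web_search", "arguments": '{"query": "x"}'}}]}
TEXT = {"content": "I would search the web for x.", "tool_calls": None}
BAD = {"tool_calls": [{"function": {"name": "web_search", "arguments": "{}"}}]}
```

A check like this only covers names and argument keys; it does not validate argument types or whether the tool was actually executed.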

On a related note, in multi-agent setups where tool-calling agents are chained or delegated to, this kind of silent tool failure becomes a trust and verification problem as much as a model capability problem. In work we've done on AgentGraph, tracking which agent invoked which tool, and whether the tool call was well-formed or silently degraded, is critical for debugging orchestration issues. That is similar in spirit to what the SteelSpine replay approach is trying to solve for agent debugging more broadly. For the course notebooks it's probably out of scope, but it's worth flagging that the fix here is load-bearing if these examples get used as templates for more complex pipelines.

Ready to merge

This branch is ready to get merged automatically.
