Agents Course documentation
Writing actions as code snippets or JSON blobs
Tool Calling Agents are the second type of agent available in smolagents. Unlike Code Agents, which write their actions as Python snippets, these agents use the built-in tool-calling capabilities of LLM providers to generate tool calls as JSON structures. This is the standard approach used by OpenAI, Anthropic, and many other providers.
Let's look at an example. When Alfred wants to search for catering services and party ideas, a CodeAgent would generate and run Python code like this:
for query in [
    "Best catering services in Gotham City",
    "Party theme ideas for superheroes"
]:
    print(web_search(f"Search for: {query}"))
A ToolCallingAgent would instead create a JSON structure:
[
{"name": "web_search", "arguments": "Best catering services in Gotham City"},
{"name": "web_search", "arguments": "Party theme ideas for superheroes"}
]
This JSON blob is then used to execute the tool calls.
While smolagents primarily focuses on CodeAgents since they perform better overall, ToolCallingAgents can be effective for simple systems that don't require variable handling or complex tool calls.
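To make the variable-handling point concrete, here is a small sketch. It assumes a hypothetical stub `web_search` that returns a list of strings; the real tool's return type may differ. A code-style action can keep an intermediate result in a variable and feed it into a second call within a single step. With JSON tool calls, the arguments are fixed when the JSON is generated, so a follow-up query that depends on the first result would need another model step.

```python
def web_search(query: str) -> list[str]:
    """Hypothetical stub: returns canned results instead of searching the web."""
    canned = {
        "Best catering services in Gotham City": ["Gotham Gourmet", "Wayne Catering Co."],
    }
    return canned.get(query, [f"result for {query}"])

# Code-style action: one step, chaining two calls through a variable.
caterers = web_search("Best catering services in Gotham City")
top_choice = caterers[0]  # intermediate value reused below
reviews = web_search(f"Reviews for {top_choice}")
print(reviews)
```

The second query depends on `top_choice`, which only exists after the first call has run. That dependency is what a flat list of JSON tool calls cannot express in one step.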
How Do Tool Calling Agents Work?
Tool Calling Agents follow the same multi-step workflow as Code Agents (see the previous section for details).
The key difference is in how they structure their actions: instead of executable code, they generate JSON objects that specify tool names and arguments. The system then parses these instructions to execute the appropriate tools.
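The parse-and-dispatch step described above can be sketched in plain Python. This is a minimal illustration, not how smolagents implements it: the `TOOLS` registry, the `execute_tool_calls` helper, and the stub `web_search` function are all hypothetical, and the sketch assumes that `arguments` arrives as a JSON object keyed by parameter name.

```python
import json


def web_search(query: str) -> str:
    """Hypothetical stub standing in for a real search tool."""
    return f"results for: {query}"


# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"web_search": web_search}


def execute_tool_calls(blob: str) -> list[str]:
    """Parse a JSON list of tool calls and run each one in order."""
    outputs = []
    for call in json.loads(blob):
        name = call["name"]
        if name not in TOOLS:
            raise ValueError(f"Unknown tool: {name}")
        # Assumption: arguments are a dict of keyword arguments.
        outputs.append(TOOLS[name](**call["arguments"]))
    return outputs


blob = """
[
  {"name": "web_search", "arguments": {"query": "Best catering services in Gotham City"}},
  {"name": "web_search", "arguments": {"query": "Party theme ideas for superheroes"}}
]
"""
results = execute_tool_calls(blob)
for line in results:
    print(line)
```

Because the model only emits data, the system decides which callables are reachable: a tool name that is not in the registry is rejected before anything runs.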
Example: Running a Tool Calling Agent
Let's revisit the previous example where Alfred started party preparations, but this time we'll use a ToolCallingAgent to highlight the difference. We'll build an agent that can search the web using DuckDuckGo, just like in our Code Agent example. The only difference is the agent type; the framework handles everything else:
from smolagents import ToolCallingAgent, DuckDuckGoSearchTool, HfApiModel

# Same search tool as in the Code Agent example; only the agent class changes.
agent = ToolCallingAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())

agent.run("Search for the best music recommendations for a party at the Wayne's mansion.")
When you examine the agent's trace, instead of seeing "Executing parsed code:", you'll see something like:
Calling tool: 'web_search' with arguments: {'query': "best music recommendations for a party at Wayne's mansion"}
The agent generates a structured tool call that the system processes to produce the output, rather than directly executing code like a CodeAgent.
Now that we understand both agent types, we can choose the right one for our needs. Let's continue exploring smolagents to make Alfred's party a success!
Resources
- ToolCallingAgent documentation - Official documentation for ToolCallingAgent