Spaces:
Running
Running
File size: 12,145 Bytes
cb26688 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 |
# Unlocking Database Intelligence with AI Agents: A `smolagents` Tutorial
[Open In Colab](https://colab.research.google.com/github/huggingface/smolagents/blob/main/notebooks/text_to_sql.ipynb)
[Open In Studio Lab](https://studiolab.sagemaker.aws/import/github/huggingface/smolagents/blob/main/notebooks/text_to_sql.ipynb)
This guide explores how to develop an intelligent agent using the `smolagents` framework, specifically enabling it to interact with a SQL database.
---
## Beyond Simple Text-to-SQL: The Agent Advantage
Why opt for an advanced agent system instead of a straightforward text-to-SQL pipeline?
Traditional text-to-SQL solutions are often quite rigid. A direct translation from natural language to a database query can easily lead to syntactical errors, causing the database to reject the query. More insidiously, a query might execute without error but produce entirely incorrect or irrelevant results, providing no indication of its inaccuracy. This "silent failure" can be detrimental for critical applications.
👉 An agent-based system, conversely, possesses the crucial capability to **critically evaluate outputs and execution logs**. It can identify when a query has failed or yielded unexpected results, and then iteratively refine its strategy or reformulate the query. This inherent capacity for self-correction significantly boosts performance and reliability.
Let's dive into building such an agent! 💪
First, ensure all necessary libraries are installed by running the command below:
```bash
!pip install smolagents python-dotenv sqlalchemy --upgrade -q
```
To enable interaction with Large Language Models (LLMs) via inference providers, you'll need an authentication token, such as an `HF_TOKEN` from Hugging Face. We'll use `python-dotenv` to load this from your environment variables.
```python
from dotenv import load_dotenv
load_dotenv()
```
### Step 1: Database Initialization
We begin by setting up our in-memory SQLite database using `SQLAlchemy`. This involves defining our table structures and populating them with initial data.
```python
from sqlalchemy import (
create_engine,
MetaData,
Table,
Column,
String,
Integer,
Float,
insert,
inspect,
text, # Essential for executing raw SQL expressions
)
# Establish an in-memory SQLite database connection
engine = create_engine("sqlite:///:memory:")
metadata_obj = MetaData()
# Utility function for bulk data insertion
def insert_rows_into_table(rows, table, engine=engine):
for row in rows:
stmt = insert(table).values(**row)
with engine.begin() as connection:
connection.execute(stmt)
# Define the 'receipts' table schema
table_name = "receipts"
receipts = Table(
table_name,
metadata_obj,
Column("receipt_id", Integer, primary_key=True), # Unique identifier for each transaction
Column("customer_name", String(255)), # Full name of the patron
Column("price", Float), # Total cost of the receipt
Column("tip", Float), # Gratuity amount
)
# Create the defined table within our database
metadata_obj.create_all(engine)
# Sample transaction data
rows = [
{"receipt_id": 1, "customer_name": "Alan Payne", "price": 12.06, "tip": 1.20},
{"receipt_id": 2, "customer_name": "Alex Mason", "price": 23.86, "tip": 0.24},
{"receipt_id": 3, "customer_name": "Woodrow Wilson", "price": 53.43, "tip": 5.43},
{"receipt_id": 4, "customer_name": "Margaret James", "price": 21.11, "tip": 1.00},
]
# Populate the 'receipts' table
insert_rows_into_table(rows, receipts)
```
### Step 2: Crafting the Agent's Database Tool
For an AI agent to interact with a database, it requires specialized **tools**. Our `sql_engine` function will serve as this tool, allowing the agent to execute SQL queries.
The tool's docstring plays a critical role, as its content (the `description` attribute) is presented to the LLM by the agent system. This description guides the LLM on _how_ and _when_ to utilize the tool, including details about available tables and their column structures.
First, let's extract the schema details for our `receipts` table:
```python
inspector = inspect(engine)
columns_info = [(col["name"], col["type"]) for col in inspector.get_columns("receipts")]
table_description = "Columns:\n" + "\n".join([f" - {name}: {col_type}" for name, col_type in columns_info])
print(table_description)
```
```
Columns:
- receipt_id: INTEGER
- customer_name: VARCHAR(255)
- price: FLOAT
- tip: FLOAT
```
Now, we'll construct our `sql_engine` tool. Key elements include:
- The `@tool` decorator from `smolagents` to designate it as an agent capability.
- A comprehensive docstring, complete with an `Args:` section, to inform the LLM about the tool's purpose and expected inputs.
- Type hints for both input and output parameters, enhancing clarity and guiding the LLM's code generation.
```python
from smolagents import tool
@tool
def sql_engine(query: str) -> str:
"""
Enables execution of SQL queries against the database.
Outputs the query results as a formatted string.
Known tables and their column structures:
Table 'receipts':
Columns:
- receipt_id: INTEGER (Primary Key)
- customer_name: VARCHAR(255)
- price: FLOAT
- tip: FLOAT
Args:
query: The precise SQL query string to be executed.
Example: "SELECT customer_name FROM receipts WHERE price > 10.0;"
"""
output = ""
with engine.connect() as con:
# Utilize text() to safely execute raw SQL within SQLAlchemy
rows = con.execute(text(query))
for row in rows:
output += "\n" + str(row) # Converts each row of results into a string representation
return output
```
### Step 3: Assembling the AI Agent
With our database and tool ready, we now instantiate the `CodeAgent`. This is `smolagents’` flagship agent class, designed to generate and execute code, and to iteratively refine its actions based on the ReAct (Reasoning + Acting) framework.
The `model` parameter links our agent to a Large Language Model. `InferenceClientModel` facilitates access to LLMs via Hugging Face's Inference API, supporting both Serverless and Dedicated endpoints. Alternatively, you could integrate other proprietary LLM APIs.
```python
from smolagents import CodeAgent, InferenceClientModel
agent = CodeAgent(
tools=[sql_engine], # Provide the 'sql_engine' tool to our agent
model=InferenceClientModel(model_id="meta-llama/Llama-3.1-8B-Instruct"), # Selecting our LLM
)
```
### Step 4: Posing a Query to the Agent
Our agent is now configured. Let's challenge it with a natural language question. The agent will then leverage its LLM and `sql_engine` tool to find the answer.
```python
agent.run("Can you give me the name of the client who got the most expensive receipt?")
```
**Understanding the Agent's Iterative Solution Process:**
The `CodeAgent` employs a self-correcting, cyclical approach:
1. **Intent Comprehension:** The LLM interprets the request, identifying the need to find the "most expensive receipt."
2. **Tool Selection:** It recognizes that the `sql_engine` tool is necessary for database interaction.
3. **Initial Code Generation:** The agent generates its first attempt at a SQL query (e.g., `SELECT MAX(price) FROM receipts`) to get the maximum price. It then tries to use this result in a follow-up query.
4. **Execution and Feedback:** The `sql_engine` executes the query. However, the output is a string like `\n(53.43,)`. If the agent naively tries to embed this string directly into another SQL query (e.g., `WHERE price = (53.43,)`), it will encounter a `syntax error`.
5. **Adaptive Self-Correction:** Upon receiving an `OperationalError` (e.g., "syntax error" or "could not convert string to float"), the LLM analyzes the error. It understands that the string-formatted output needs to be correctly parsed into a numeric type before being used in subsequent SQL or Python logic. Previous attempts might fail due to unexpected characters (like newlines) or incorrect string manipulation.
6. **Refined Strategy:** Learning from its previous attempts, the agent eventually generates a more efficient, consolidated SQL query: `SELECT MAX(price), customer_name FROM receipts ORDER BY price DESC LIMIT 1`. This effectively retrieves both the highest price and the corresponding customer name in a single database call.
7. **Result Parsing and Finalization:** Finally, the LLM generates Python code to accurately parse the `\n(53.43, 'Woodrow Wilson')` string output from the `sql_engine`, extracting the customer name. It then provides the `final_answer`.
This continuous cycle of **reasoning, acting via tools, observing outcomes (including errors), and self-correction** is fundamental to the robustness and adaptability of agent-based systems.
---
### Level 2: Inter-Table Queries (Table Joins)
Let's elevate the complexity! Our goal now is to enable the agent to handle questions that require combining data from multiple tables using SQL joins.
To achieve this, we'll define a second table, `waiters`, which records the names of waiters associated with each `receipt_id`.
```python
# Define the 'waiters' table schema
table_name = "waiters"
waiters = Table(
table_name,
metadata_obj,
Column("receipt_id", Integer, primary_key=True), # Links to 'receipts' table
Column("waiter_name", String(16), primary_key=True), # Name of the assigned waiter
)
# Create the 'waiters' table in the database
metadata_obj.create_all(engine)
# Sample data for the 'waiters' table
rows = [
{"receipt_id": 1, "waiter_name": "Corey Johnson"},
{"receipt_id": 2, "waiter_name": "Michael Watts"},
{"receipt_id": 3, "waiter_name": "Michael Watts"},
{"receipt_id": 4, "waiter_name": "Margaret James"},
]
# Populate the 'waiters' table
insert_rows_into_table(rows, waiters)
```
With the introduction of a new table, it's crucial to **update the `sql_engine` tool's description**. This ensures the LLM is aware of the `waiters` table and its schema, allowing it to construct queries that span both tables.
```python
updated_description = """This tool allows performing SQL queries on the database, returning results as a string.
It can access the following tables:"""
inspector = inspect(engine)
for table in ["receipts", "waiters"]:
columns_info = [(col["name"], col["type"]) for col in inspector.get_columns(table)]
table_description = f"Table '{table}':\n"
table_description += " Columns:\n" + "\n".join([f" - {name}: {col_type}" for name, col_type in columns_info])
updated_description += "\n\n" + table_description
print(updated_description)
```
For more intricate requests like this, switching to a more powerful LLM can significantly enhance the agent's reasoning capabilities. Here, we'll upgrade to `Qwen/Qwen2.5-Coder-32B-Instruct`.
```python
# Assign the updated description to the tool
sql_engine.description = updated_description
agent = CodeAgent(
tools=[sql_engine],
model=InferenceClientModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct"),
)
agent.run("Which waiter received the highest total amount in tips?")
```
The agent successfully addresses this challenge, often directly formulating the correct SQL query involving a `JOIN` operation, and then performing the necessary calculations in Python. The simplicity of setup versus the complexity of the task handled demonstrates the power of this agentic approach!
This tutorial covered several key concepts:
- **Constructing custom tools** for agents.
- **Dynamically updating a tool's description** to reflect changes in available data or functionalities.
- **Leveraging stronger LLMs** to empower an agent's reasoning for more complex tasks.
✅ You are now equipped to start building your own advanced text-to-SQL systems! ✨
|