Spaces:
Running
on
CPU Upgrade
Running
on
CPU Upgrade
Auto-Analyst Backend - Getting Started Guide
π― Overview
This guide will help you set up and understand the Auto-Analyst backend system. Auto-Analyst is a multi-agent AI platform that orchestrates specialized agents for comprehensive data analysis.
ποΈ Core Concepts
1. Multi-Agent System
The platform uses specialized AI agents:
- Preprocessing Agent: Data cleaning and preparation
- Statistical Analytics Agent: Statistical analysis and insights
- Machine Learning Agent: Scikit-learn based modeling
- Data Visualization Agent: Chart and plot generation
2. Template System
- Individual Agents: Single-purpose agents for specific tasks
- Planner Agents: Multi-agent coordination for complex workflows
- User Templates: Customizable agent preferences
- Default vs Premium: Core agents available to all users
3. Session Management
- Session-based user tracking
- Shared DataFrame context between agents
- Conversation history and code execution tracking
4. Deep Analysis System
- Multi-step analysis workflow (questions β planning β execution β synthesis)
- Streaming progress updates
- HTML report generation
π Quick Start
1. Installation
# Clone and navigate to backend
cd Auto-Analyst-CS/auto-analyst-backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # Linux/Mac
# or
venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
2. Environment Variables
Create .env
file with:
# Database
DATABASE_URL=sqlite:///./auto_analyst.db # For development
# DATABASE_URL=postgresql://user:pass@host:port/db # For production
# AI Models
ANTHROPIC_API_KEY=your_anthropic_key_here
OPENAI_API_KEY=your_openai_key_here
# Authentication (optional)
ADMIN_API_KEY=your_admin_key_here
3. Database Initialization
# Initialize database and default agents
python -c "
from src.db.init_db import init_db
init_db()
print('β
Database initialized successfully')
"
4. Start the Server
# Development server
python app.py
# Or with uvicorn
uvicorn app:app --reload --host 0.0.0.0 --port 8000
5. Verify Setup
Visit: http://localhost:8000/docs
for interactive API documentation
π Key Files to Understand
Core Application Files
app.py
- Main FastAPI application and core endpointssrc/agents/agents.py
- Agent definitions and orchestrationsrc/agents/deep_agents.py
- Deep analysis systemsrc/db/schemas/models.py
- Database modelssrc/managers/chat_manager.py
- Chat and session management
Route Files (API Endpoints)
src/routes/session_routes.py
- File uploads, sessions, authenticationsrc/routes/chat_routes.py
- Chat and messagingsrc/routes/code_routes.py
- Code execution and processingsrc/routes/templates_routes.py
- Agent template managementsrc/routes/deep_analysis_routes.py
- Deep analysis reportssrc/routes/analytics_routes.py
- Usage analytics and monitoring
Configuration Files
agents_config.json
- Agent and template definitionsrequirements.txt
- Python dependenciesalembic.ini
- Database migration configuration
π§ Development Workflow
1. Adding New Agents
# 1. Define agent signature in src/agents/agents.py
class new_agent(dspy.Signature):
"""Agent description"""
goal = dspy.InputField(desc="Analysis goal")
dataset = dspy.InputField(desc="Dataset info")
result = dspy.OutputField(desc="Analysis result")
# 2. Add to agents_config.json
{
"template_name": "new_agent",
"description": "Agent description",
"variant_type": "both",
"is_premium": false,
"usage_count": 0
}
# 3. Register in agent loading system
2. Adding New Endpoints
# 1. Create route in src/routes/feature_routes.py
from fastapi import APIRouter
router = APIRouter(prefix="/feature", tags=["feature"])
@router.get("/endpoint")
async def new_endpoint():
return {"message": "Hello"}
# 2. Register in app.py
from src.routes.feature_routes import router as feature_router
app.include_router(feature_router)
3. Database Changes
# 1. Modify models in src/db/schemas/models.py
# 2. Create migration
alembic revision --autogenerate -m "description"
# 3. Apply migration
alembic upgrade head
π§ͺ Testing Your Changes
1. Test API Endpoints
# Use the interactive docs
open http://localhost:8000/docs
# Or use curl
curl -X GET "http://localhost:8000/health"
2. Test Agent System
# Test individual agent
python -c "
from src.agents.agents import preprocessing_agent
import dspy
dspy.LM('anthropic/claude-sonnet-4-20250514')
agent = dspy.ChainOfThought(preprocessing_agent)
result = agent(goal='clean data', dataset='test data')
print(result)
"
3. Test Database Operations
# Test database
python -c "
from src.db.init_db import session_factory
from src.db.schemas.models import AgentTemplate
session = session_factory()
templates = session.query(AgentTemplate).all()
print(f'Found {len(templates)} templates')
session.close()
"
π Common Development Tasks
Adding a New Feature
- Plan the Feature: Define requirements and API design
- Database Changes: Add new models if needed
- Create Routes: Add API endpoints in
src/routes/
- Business Logic: Add managers in
src/managers/
if complex - Documentation: Update relevant
.md
files - Testing: Test endpoints and integration
Debugging Issues
- Check Logs: Application logs show detailed error information
- Database State: Verify data with database queries
- API Testing: Use
/docs
interface for endpoint testing - Agent Behavior: Test individual agents separately
Performance Optimization
- Database Queries: Use SQLAlchemy query optimization
- Agent Execution: Implement async patterns for agent orchestration
- Resource Management: Monitor memory usage for large datasets
π System Architecture Overview
graph TD
A[Frontend Request] --> B[FastAPI Router]
B --> C[Route Handler]
C --> D[Manager Layer]
D --> E[Database Layer]
D --> F[Agent System]
F --> G[AI Models]
G --> H[Code Generation]
H --> I[Execution Environment]
I --> J[Results Processing]
J --> K[Response]
subgraph "Agent Orchestration"
F1[Individual Agents]
F2[Planner Module]
F3[Deep Analysis]
F1 --> F2
F2 --> F3
end
F --> F1
π Template Integration
The system uses active user templates for agent selection:
Default Agents (Always Available)
preprocessing_agent
(individual & planner variants)statistical_analytics_agent
(individual & planner variants)sk_learn_agent
(individual & planner variants)data_viz_agent
(individual & planner variants)
Template Loading Logic
- Individual Agent Execution (
@agent_name
): Loads ALL available templates - Planner Execution: Loads user's enabled templates (max 10 for performance)
- Deep Analysis: Uses user's active template preferences
- Fallback: Uses 4 core agents if no user preferences found
This architecture ensures users can leverage their preferred agents while maintaining system performance and reliability.