Spaces:
Running
on
CPU Upgrade
Running
on
CPU Upgrade
# Auto-Analyst Backend - Getting Started Guide | |
## π― Overview | |
This guide will help you set up and understand the Auto-Analyst backend system. Auto-Analyst is a multi-agent AI platform that orchestrates specialized agents for comprehensive data analysis. | |
## ποΈ Core Concepts | |
### 1. **Multi-Agent System** | |
The platform uses specialized AI agents: | |
- **Preprocessing Agent**: Data cleaning and preparation | |
- **Statistical Analytics Agent**: Statistical analysis and insights | |
- **Machine Learning Agent**: Scikit-learn based modeling | |
- **Data Visualization Agent**: Chart and plot generation | |
### 2. **Template System** | |
- **Individual Agents**: Single-purpose agents for specific tasks | |
- **Planner Agents**: Multi-agent coordination for complex workflows | |
- **User Templates**: Customizable agent preferences | |
- **Default vs Premium**: Core agents available to all users | |
### 3. **Session Management** | |
- Session-based user tracking | |
- Shared DataFrame context between agents | |
- Conversation history and code execution tracking | |
### 4. **Deep Analysis System** | |
- Multi-step analysis workflow (questions β planning β execution β synthesis) | |
- Streaming progress updates | |
- HTML report generation | |
## π Quick Start | |
### 1. Installation | |
```bash | |
# Clone and navigate to backend | |
cd Auto-Analyst-CS/auto-analyst-backend | |
# Create virtual environment | |
python -m venv venv | |
source venv/bin/activate # Linux/Mac | |
# or | |
venv\Scripts\activate # Windows | |
# Install dependencies | |
pip install -r requirements.txt | |
``` | |
### 2. Environment Variables | |
Create `.env` file with: | |
```env | |
# Database | |
DATABASE_URL=sqlite:///./auto_analyst.db # For development | |
# DATABASE_URL=postgresql://user:pass@host:port/db # For production | |
# AI Models | |
ANTHROPIC_API_KEY=your_anthropic_key_here | |
OPENAI_API_KEY=your_openai_key_here | |
# Authentication (optional) | |
ADMIN_API_KEY=your_admin_key_here | |
``` | |
### 3. Database Initialization | |
```bash | |
# Initialize database and default agents | |
python -c " | |
from src.db.init_db import init_db | |
init_db() | |
print('β Database initialized successfully') | |
" | |
``` | |
### 4. Start the Server | |
```bash | |
# Development server | |
python app.py | |
# Or with uvicorn | |
uvicorn app:app --reload --host 0.0.0.0 --port 8000 | |
``` | |
### 5. Verify Setup | |
Visit: `http://localhost:8000/docs` for interactive API documentation | |
## π Key Files to Understand | |
### Core Application Files | |
1. **`app.py`** - Main FastAPI application and core endpoints | |
2. **`src/agents/agents.py`** - Agent definitions and orchestration | |
3. **`src/agents/deep_agents.py`** - Deep analysis system | |
4. **`src/db/schemas/models.py`** - Database models | |
5. **`src/managers/chat_manager.py`** - Chat and session management | |
### Route Files (API Endpoints) | |
- **`src/routes/session_routes.py`** - File uploads, sessions, authentication | |
- **`src/routes/chat_routes.py`** - Chat and messaging | |
- **`src/routes/code_routes.py`** - Code execution and processing | |
- **`src/routes/templates_routes.py`** - Agent template management | |
- **`src/routes/deep_analysis_routes.py`** - Deep analysis reports | |
- **`src/routes/analytics_routes.py`** - Usage analytics and monitoring | |
### Configuration Files | |
- **`agents_config.json`** - Agent and template definitions | |
- **`requirements.txt`** - Python dependencies | |
- **`alembic.ini`** - Database migration configuration | |
## π§ Development Workflow | |
### 1. Adding New Agents | |
```python | |
# 1. Define agent signature in src/agents/agents.py | |
class new_agent(dspy.Signature): | |
"""Agent description""" | |
goal = dspy.InputField(desc="Analysis goal") | |
dataset = dspy.InputField(desc="Dataset info") | |
result = dspy.OutputField(desc="Analysis result") | |
# 2. Add to agents_config.json | |
{ | |
"template_name": "new_agent", | |
"description": "Agent description", | |
"variant_type": "both", | |
"is_premium": false, | |
"usage_count": 0 | |
} | |
# 3. Register in agent loading system | |
``` | |
### 2. Adding New Endpoints | |
```python | |
# 1. Create route in src/routes/feature_routes.py | |
from fastapi import APIRouter | |
router = APIRouter(prefix="/feature", tags=["feature"]) | |
@router.get("/endpoint") | |
async def new_endpoint(): | |
return {"message": "Hello"} | |
# 2. Register in app.py | |
from src.routes.feature_routes import router as feature_router | |
app.include_router(feature_router) | |
``` | |
### 3. Database Changes | |
```bash | |
# 1. Modify models in src/db/schemas/models.py | |
# 2. Create migration | |
alembic revision --autogenerate -m "description" | |
# 3. Apply migration | |
alembic upgrade head | |
``` | |
## π§ͺ Testing Your Changes | |
### 1. Test API Endpoints | |
```bash | |
# Use the interactive docs | |
open http://localhost:8000/docs | |
# Or use curl | |
curl -X GET "http://localhost:8000/health" | |
``` | |
### 2. Test Agent System | |
```python | |
# Test individual agent | |
python -c " | |
from src.agents.agents import preprocessing_agent | |
import dspy | |
dspy.LM('anthropic/claude-sonnet-4-20250514') | |
agent = dspy.ChainOfThought(preprocessing_agent) | |
result = agent(goal='clean data', dataset='test data') | |
print(result) | |
" | |
``` | |
### 3. Test Database Operations | |
```python | |
# Test database | |
python -c " | |
from src.db.init_db import session_factory | |
from src.db.schemas.models import AgentTemplate | |
session = session_factory() | |
templates = session.query(AgentTemplate).all() | |
print(f'Found {len(templates)} templates') | |
session.close() | |
" | |
``` | |
## π Common Development Tasks | |
### Adding a New Feature | |
1. **Plan the Feature**: Define requirements and API design | |
2. **Database Changes**: Add new models if needed | |
3. **Create Routes**: Add API endpoints in `src/routes/` | |
4. **Business Logic**: Add managers in `src/managers/` if complex | |
5. **Documentation**: Update relevant `.md` files | |
6. **Testing**: Test endpoints and integration | |
### Debugging Issues | |
1. **Check Logs**: Application logs show detailed error information | |
2. **Database State**: Verify data with database queries | |
3. **API Testing**: Use `/docs` interface for endpoint testing | |
4. **Agent Behavior**: Test individual agents separately | |
### Performance Optimization | |
1. **Database Queries**: Use SQLAlchemy query optimization | |
2. **Agent Execution**: Implement async patterns for agent orchestration | |
3. **Resource Management**: Monitor memory usage for large datasets | |
## π System Architecture Overview | |
```mermaid | |
graph TD | |
A[Frontend Request] --> B[FastAPI Router] | |
B --> C[Route Handler] | |
C --> D[Manager Layer] | |
D --> E[Database Layer] | |
D --> F[Agent System] | |
F --> G[AI Models] | |
G --> H[Code Generation] | |
H --> I[Execution Environment] | |
I --> J[Results Processing] | |
J --> K[Response] | |
subgraph "Agent Orchestration" | |
F1[Individual Agents] | |
F2[Planner Module] | |
F3[Deep Analysis] | |
F1 --> F2 | |
F2 --> F3 | |
end | |
F --> F1 | |
``` | |
## π Template Integration | |
The system uses **active user templates** for agent selection: | |
### Default Agents (Always Available) | |
- `preprocessing_agent` (individual & planner variants) | |
- `statistical_analytics_agent` (individual & planner variants) | |
- `sk_learn_agent` (individual & planner variants) | |
- `data_viz_agent` (individual & planner variants) | |
### Template Loading Logic | |
1. **Individual Agent Execution** (`@agent_name`): Loads ALL available templates | |
2. **Planner Execution**: Loads user's enabled templates (max 10 for performance) | |
3. **Deep Analysis**: Uses user's active template preferences | |
4. **Fallback**: Uses 4 core agents if no user preferences found | |
This architecture ensures users can leverage their preferred agents while maintaining system performance and reliability. |