Spaces:
Running
on
CPU Upgrade
Running
on
CPU Upgrade
| # Auto-Analyst Backend - Getting Started Guide | |
| ## π― Overview | |
| This guide will help you set up and understand the Auto-Analyst backend system. Auto-Analyst is a multi-agent AI platform that orchestrates specialized agents for comprehensive data analysis. | |
| ## ποΈ Core Concepts | |
| ### 1. **Multi-Agent System** | |
| The platform uses specialized AI agents: | |
| - **Preprocessing Agent**: Data cleaning and preparation | |
| - **Statistical Analytics Agent**: Statistical analysis and insights | |
| - **Machine Learning Agent**: Scikit-learn based modeling | |
| - **Data Visualization Agent**: Chart and plot generation | |
| ### 2. **Template System** | |
| - **Individual Agents**: Single-purpose agents for specific tasks | |
| - **Planner Agents**: Multi-agent coordination for complex workflows | |
| - **User Templates**: Customizable agent preferences | |
| - **Default vs Premium**: Core agents available to all users | |
| ### 3. **Session Management** | |
| - Session-based user tracking | |
| - Shared DataFrame context between agents | |
| - Conversation history and code execution tracking | |
| ### 4. **Deep Analysis System** | |
| - Multi-step analysis workflow (questions β planning β execution β synthesis) | |
| - Streaming progress updates | |
| - HTML report generation | |
| ## π Quick Start | |
| ### 1. Installation | |
| ```bash | |
| # Clone and navigate to backend | |
| cd Auto-Analyst-CS/auto-analyst-backend | |
| # Create virtual environment | |
| python -m venv venv | |
| source venv/bin/activate # Linux/Mac | |
| # or | |
| venv\Scripts\activate # Windows | |
| # Install dependencies | |
| pip install -r requirements.txt | |
| ``` | |
| ### 2. Environment Variables | |
| Create `.env` file with: | |
| ```env | |
| # Database | |
| DATABASE_URL=sqlite:///./auto_analyst.db # For development | |
| # DATABASE_URL=postgresql://user:pass@host:port/db # For production | |
| # AI Models | |
| ANTHROPIC_API_KEY=your_anthropic_key_here | |
| OPENAI_API_KEY=your_openai_key_here | |
| # Authentication (optional) | |
| ADMIN_API_KEY=your_admin_key_here | |
| ``` | |
| ### 3. Database Initialization | |
| ```bash | |
| # Initialize database and default agents | |
| python -c " | |
| from src.db.init_db import init_db | |
| init_db() | |
| print('β Database initialized successfully') | |
| " | |
| ``` | |
| ### 4. Start the Server | |
| ```bash | |
| # Development server | |
| python app.py | |
| # Or with uvicorn | |
| uvicorn app:app --reload --host 0.0.0.0 --port 8000 | |
| ``` | |
| ### 5. Verify Setup | |
| Visit: `http://localhost:8000/docs` for interactive API documentation | |
| ## π Key Files to Understand | |
| ### Core Application Files | |
| 1. **`app.py`** - Main FastAPI application and core endpoints | |
| 2. **`src/agents/agents.py`** - Agent definitions and orchestration | |
| 3. **`src/agents/deep_agents.py`** - Deep analysis system | |
| 4. **`src/db/schemas/models.py`** - Database models | |
| 5. **`src/managers/chat_manager.py`** - Chat and session management | |
| ### Route Files (API Endpoints) | |
| - **`src/routes/session_routes.py`** - File uploads, sessions, authentication | |
| - **`src/routes/chat_routes.py`** - Chat and messaging | |
| - **`src/routes/code_routes.py`** - Code execution and processing | |
| - **`src/routes/templates_routes.py`** - Agent template management | |
| - **`src/routes/deep_analysis_routes.py`** - Deep analysis reports | |
| - **`src/routes/analytics_routes.py`** - Usage analytics and monitoring | |
| ### Configuration Files | |
| - **`agents_config.json`** - Agent and template definitions | |
| - **`requirements.txt`** - Python dependencies | |
| - **`alembic.ini`** - Database migration configuration | |
| ## π§ Development Workflow | |
| ### 1. Adding New Agents | |
| ```python | |
| # 1. Define agent signature in src/agents/agents.py | |
| class new_agent(dspy.Signature): | |
| """Agent description""" | |
| goal = dspy.InputField(desc="Analysis goal") | |
| dataset = dspy.InputField(desc="Dataset info") | |
| result = dspy.OutputField(desc="Analysis result") | |
| # 2. Add to agents_config.json | |
| { | |
| "template_name": "new_agent", | |
| "description": "Agent description", | |
| "variant_type": "both", | |
| "is_premium": false, | |
| "usage_count": 0 | |
| } | |
| # 3. Register in agent loading system | |
| ``` | |
| ### 2. Adding New Endpoints | |
| ```python | |
| # 1. Create route in src/routes/feature_routes.py | |
| from fastapi import APIRouter | |
| router = APIRouter(prefix="/feature", tags=["feature"]) | |
| @router.get("/endpoint") | |
| async def new_endpoint(): | |
| return {"message": "Hello"} | |
| # 2. Register in app.py | |
| from src.routes.feature_routes import router as feature_router | |
| app.include_router(feature_router) | |
| ``` | |
| ### 3. Database Changes | |
| ```bash | |
| # 1. Modify models in src/db/schemas/models.py | |
| # 2. Create migration | |
| alembic revision --autogenerate -m "description" | |
| # 3. Apply migration | |
| alembic upgrade head | |
| ``` | |
| ## π§ͺ Testing Your Changes | |
| ### 1. Test API Endpoints | |
| ```bash | |
| # Use the interactive docs | |
| open http://localhost:8000/docs | |
| # Or use curl | |
| curl -X GET "http://localhost:8000/health" | |
| ``` | |
| ### 2. Test Agent System | |
| ```python | |
| # Test individual agent | |
| python -c " | |
| from src.agents.agents import preprocessing_agent | |
| import dspy | |
| dspy.LM('anthropic/claude-sonnet-4-20250514') | |
| agent = dspy.ChainOfThought(preprocessing_agent) | |
| result = agent(goal='clean data', dataset='test data') | |
| print(result) | |
| " | |
| ``` | |
| ### 3. Test Database Operations | |
| ```python | |
| # Test database | |
| python -c " | |
| from src.db.init_db import session_factory | |
| from src.db.schemas.models import AgentTemplate | |
| session = session_factory() | |
| templates = session.query(AgentTemplate).all() | |
| print(f'Found {len(templates)} templates') | |
| session.close() | |
| " | |
| ``` | |
| ## π Common Development Tasks | |
| ### Adding a New Feature | |
| 1. **Plan the Feature**: Define requirements and API design | |
| 2. **Database Changes**: Add new models if needed | |
| 3. **Create Routes**: Add API endpoints in `src/routes/` | |
| 4. **Business Logic**: Add managers in `src/managers/` if complex | |
| 5. **Documentation**: Update relevant `.md` files | |
| 6. **Testing**: Test endpoints and integration | |
| ### Debugging Issues | |
| 1. **Check Logs**: Application logs show detailed error information | |
| 2. **Database State**: Verify data with database queries | |
| 3. **API Testing**: Use `/docs` interface for endpoint testing | |
| 4. **Agent Behavior**: Test individual agents separately | |
| ### Performance Optimization | |
| 1. **Database Queries**: Use SQLAlchemy query optimization | |
| 2. **Agent Execution**: Implement async patterns for agent orchestration | |
| 3. **Resource Management**: Monitor memory usage for large datasets | |
| ## π System Architecture Overview | |
| ```mermaid | |
| graph TD | |
| A[Frontend Request] --> B[FastAPI Router] | |
| B --> C[Route Handler] | |
| C --> D[Manager Layer] | |
| D --> E[Database Layer] | |
| D --> F[Agent System] | |
| F --> G[AI Models] | |
| G --> H[Code Generation] | |
| H --> I[Execution Environment] | |
| I --> J[Results Processing] | |
| J --> K[Response] | |
| subgraph "Agent Orchestration" | |
| F1[Individual Agents] | |
| F2[Planner Module] | |
| F3[Deep Analysis] | |
| F1 --> F2 | |
| F2 --> F3 | |
| end | |
| F --> F1 | |
| ``` | |
| ## π Template Integration | |
| The system uses **active user templates** for agent selection: | |
| ### Default Agents (Always Available) | |
| - `preprocessing_agent` (individual & planner variants) | |
| - `statistical_analytics_agent` (individual & planner variants) | |
| - `sk_learn_agent` (individual & planner variants) | |
| - `data_viz_agent` (individual & planner variants) | |
| ### Template Loading Logic | |
| 1. **Individual Agent Execution** (`@agent_name`): Loads ALL available templates | |
| 2. **Planner Execution**: Loads user's enabled templates (max 10 for performance) | |
| 3. **Deep Analysis**: Uses user's active template preferences | |
| 4. **Fallback**: Uses 4 core agents if no user preferences found | |
| This architecture ensures users can leverage their preferred agents while maintaining system performance and reliability. |