# Auto-Analyst Backend Troubleshooting Guide
## 🚨 Common Startup Issues
### 1. Database Connection Problems

**Problem**: Database connection failed

```
❌ sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: users
```

**Solutions**:

**Initialize Database:**

```bash
python -c "
from src.db.init_db import init_database
from src.db.init_default_agents import initialize_default_agents
init_database()
initialize_default_agents()
print('✅ Database initialized')
"
```

**Check Database File Permissions:**

```bash
# For SQLite
ls -la auto_analyst.db
chmod 666 auto_analyst.db  # If needed
```

**Verify DATABASE_URL:**

```bash
# Check the .env file
cat .env | grep DATABASE_URL

# For PostgreSQL (production)
DATABASE_URL=postgresql://user:password@host:port/database

# For SQLite (development)
DATABASE_URL=sqlite:///./auto_analyst.db
```
**Problem**: PostgreSQL connection issues

```
❌ psycopg2.OperationalError: FATAL: database "auto_analyst" does not exist
```

**Solutions**:

**Create Database:**

```sql
-- Connect to PostgreSQL first: psql -h localhost -U postgres
CREATE DATABASE auto_analyst;
\q
```

**Update Connection String:**

```bash
DATABASE_URL=postgresql://username:password@localhost:5432/auto_analyst
```
### 2. Agent Template Loading Issues

**Problem**: No agents found

```
❌ RuntimeError: No agents loaded for user. Cannot proceed with analysis.
```

**Solutions**:

**Initialize Default Agents:**

```bash
python -c "
from src.db.init_default_agents import initialize_default_agents
initialize_default_agents()
print('✅ Default agents initialized')
"
```

**Check Agent Templates in Database:**

```bash
python -c "
from src.db.init_db import session_factory
from src.db.schemas.models import AgentTemplate
session = session_factory()
templates = session.query(AgentTemplate).all()
print(f'Found {len(templates)} templates:')
for t in templates:
    print(f' - {t.template_name}: {t.is_active}')
session.close()
"
```

**Populate Templates from Config:**

```bash
python scripts/populate_agent_templates.py
```
### 3. API Key Issues

**Problem**: Missing API keys

```
❌ AuthenticationError: Invalid API key provided
```

**Solutions**:

**Check Environment Variables:**

```bash
# Verify API keys are set
echo $ANTHROPIC_API_KEY
echo $OPENAI_API_KEY

# Or check the .env file
cat .env | grep API_KEY
```

**Add Missing Keys:**

```bash
# Add to the .env file
ANTHROPIC_API_KEY=sk-ant-api03-...
OPENAI_API_KEY=sk-...
ADMIN_API_KEY=your_admin_key_here
```
**Test API Key Validity:**

```bash
python -c "
import os
from anthropic import Anthropic

client = Anthropic(api_key=os.getenv('ANTHROPIC_API_KEY'))
try:
    # Minimal test call
    response = client.messages.create(
        model='claude-3-5-sonnet-20241022',
        max_tokens=10,
        messages=[{'role': 'user', 'content': 'Hello'}]
    )
    print('✅ Anthropic API key valid')
except Exception as e:
    print(f'❌ Anthropic API key invalid: {e}')
"
```
## 🤖 Agent System Issues
### 1. Agent Not Found Errors

**Problem**: Specific agent not available

```
❌ KeyError: 'custom_agent' not found in loaded agents
```

**Solutions**:

**Check Available Agents:**

```bash
python -c "
from src.agents.agents import load_user_enabled_templates_from_db
from src.db.init_db import session_factory
session = session_factory()
agents = load_user_enabled_templates_from_db('test_user', session)
print('Available agents:', list(agents.keys()))
session.close()
"
```

**Verify Agent Template Exists:**

```bash
python -c "
from src.db.init_db import session_factory
from src.db.schemas.models import AgentTemplate
session = session_factory()
agent = session.query(AgentTemplate).filter_by(template_name='custom_agent').first()
if agent:
    print(f'Agent found: {agent.display_name}, Active: {agent.is_active}')
else:
    print('Agent not found in database')
session.close()
"
```

**Add Missing Agent Template:**

```bash
# Add to agents_config.json or use database insertion
python scripts/populate_agent_templates.py
```
### 2. Deep Analysis Failures

**Problem**: Deep analysis stops unexpectedly

```
❌ DeepAnalysisError: Agent execution failed at step 3
```

**Solutions**:

**Check Agent Configuration:**

```bash
# Verify the user has the required agents enabled
python -c "
from src.agents.deep_agents import get_user_enabled_agent_names
from src.db.init_db import session_factory
session = session_factory()
agents = get_user_enabled_agent_names('test_user', session)
required = ['preprocessing_agent', 'statistical_analytics_agent', 'sk_learn_agent', 'data_viz_agent']
print('Required agents:', required)
print('Available agents:', agents)
print('Missing:', [a for a in required if a not in agents])
session.close()
"
```

**Increase Timeout Settings:**

```python
# In deep_agents.py, increase timeout values
timeout = 300  # Increase from the default
```

**Check Dataset Size:**

```python
# Reduce dataset size for complex analysis
df_sample = df.sample(n=1000)  # Use a sample for testing
```
## ⚡ Code Execution Problems
### 1. Code Execution Timeouts

**Problem**: Code execution takes too long

```
❌ TimeoutError: Code execution exceeded 120 seconds
```

**Solutions**:

**Optimize Generated Code** (see the sketch after this list):

- Use data sampling for large datasets
- Simplify analysis requirements
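
As a rough illustration, generated analysis code can guard against oversized inputs by sampling up front (a minimal sketch; the 100,000-row threshold and the `sample_if_large` helper are illustrative assumptions, not project defaults):

```python
import pandas as pd

def sample_if_large(df: pd.DataFrame, max_rows: int = 100_000) -> pd.DataFrame:
    """Return df unchanged when small; otherwise a reproducible random sample."""
    if len(df) <= max_rows:
        return df
    return df.sample(n=max_rows, random_state=42)

# df = sample_if_large(df)  # apply before expensive analysis steps
```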
**Check Resource Usage:**

```python
import psutil

print(f"Memory usage: {psutil.virtual_memory().percent}%")
print(f"CPU usage: {psutil.cpu_percent()}%")
```

**Increase Timeout Settings:**

```python
# In the clean_and_store_code function
future.result(timeout=600)  # Increase timeout to 10 minutes
```
**Problem**: Import errors in generated code

```
❌ ModuleNotFoundError: No module named 'some_library'
```

**Solutions**:

**Check Available Libraries:**

```python
# Available in the execution environment:
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
import sklearn
import statsmodels.api as sm
```

**Add Missing Dependencies:**

```bash
pip install missing_library
```

**Update Execution Environment:**

```python
# In the clean_and_store_code function
exec_globals.update({
    'new_library': __import__('new_library')
})
```
## 4. Database Issues

**Problem**: Migration errors

```
❌ alembic.util.exc.CommandError: Can't locate revision identified by 'xyz'
```

**Solutions**:

**Reset Migration History:**

```bash
# Delete migration files (except __init__.py)
rm migrations/versions/*.py

# Create a new initial migration
alembic revision --autogenerate -m "initial migration"
alembic upgrade head
```

**Force Migration:**

```bash
# Mark the current state as up to date
alembic stamp head
```

**Recreate Database:**

```bash
# For SQLite (development)
rm auto_analyst.db
python -c "from src.db.init_db import init_database; init_database()"
```
**Problem**: Constraint violations

```
❌ IntegrityError: UNIQUE constraint failed
```

**Solutions**:

**Check Existing Records:**

```python
from src.db.init_db import session_factory
from src.db.schemas.models import AgentTemplate

session = session_factory()
templates = session.query(AgentTemplate).all()
for t in templates:
    print(f"{t.template_name}: {t.template_id}")
session.close()
```

**Clean Duplicate Data:**

```bash
python -c "
from src.db.init_db import session_factory
from src.db.schemas.models import AgentTemplate
session = session_factory()
# Remove duplicates based on template_name
seen = set()
for template in session.query(AgentTemplate).all():
    if template.template_name in seen:
        session.delete(template)
    else:
        seen.add(template.template_name)
session.commit()
session.close()
"
```
## 5. Authentication and Authorization Issues

**Problem**: Unauthorized access

```
❌ 401 Unauthorized: Invalid session
```

**Solutions**:

**Check Session ID:**

```python
# Ensure a session_id is provided with the request
headers = {"X-Session-ID": "your_session_id"}
# Or as a query parameter: ?session_id=your_session_id
```

**Create Valid Session:**

```bash
curl -X POST "http://localhost:8000/session_info" \
  -H "Content-Type: application/json"
```

**Verify Admin API Key:**

```bash
curl -X GET "http://localhost:8000/analytics/usage" \
  -H "X-API-Key: your_admin_key"
```
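
The same flow can be exercised end to end from Python. A hedged client-side sketch using the `requests` library (the endpoints match the examples above, but the `session_id` field in the response body is an assumption):

```python
import requests

BASE_URL = "http://localhost:8000"

# Create a session (assumes the response JSON contains a session_id field)
resp = requests.post(f"{BASE_URL}/session_info",
                     headers={"Content-Type": "application/json"})
resp.raise_for_status()
session_id = resp.json().get("session_id")

# Reuse the session ID on subsequent requests via the X-Session-ID header
headers = {"X-Session-ID": session_id}
health = requests.get(f"{BASE_URL}/health", headers=headers)
print(health.status_code, health.text)
```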
## 6. Performance Issues

**Problem**: Slow response times

```
⚠️ Request taking longer than expected
```

**Solutions**:

**Enable Database Connection Pooling:**

```python
# In init_db.py
from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool

engine = create_engine(
    DATABASE_URL,
    poolclass=QueuePool,
    pool_size=10,
    max_overflow=20
)
```
**Optimize Database Queries:**

```python
# Use eager loading for relationships
session.query(User).options(joinedload(User.chats)).all()
```
**Add Response Caching:**

```python
from functools import lru_cache

# Cache expensive, repeatable operations locally (arguments must be hashable)
@lru_cache(maxsize=100)
def expensive_operation(key):
    result = ...  # perform the expensive computation here
    return result
```
**Problem**: High memory usage

```
⚠️ Memory usage above 80%
```

**Solutions**:

**Optimize DataFrame Operations:**

```python
# Use chunking for large datasets
for chunk in pd.read_csv('file.csv', chunksize=1000):
    process_chunk(chunk)
```

**Clear Unused Variables:**

```python
# In code execution
import gc

del large_dataframe
gc.collect()
```

**Monitor Memory Usage:**

```python
import psutil
import logging

memory_percent = psutil.virtual_memory().percent
if memory_percent > 80:
    logging.warning(f"High memory usage: {memory_percent}%")
```
## 🔧 Debugging Tools and Commands
### Health Check Commands

```bash
# Test basic connectivity
curl http://localhost:8000/health

# Check database status
python -c "
from sqlalchemy import text
from src.db.init_db import session_factory
try:
    session = session_factory()
    session.execute(text('SELECT 1'))
    print('✅ Database connection OK')
    session.close()
except Exception as e:
    print(f'❌ Database error: {e}')
"

# Verify agent templates
python -c "
from src.db.init_db import session_factory
from src.db.schemas.models import AgentTemplate
session = session_factory()
count = session.query(AgentTemplate).count()
print(f'Agent templates in database: {count}')
session.close()
"
```
### Performance Monitoring

```python
# Memory and CPU monitoring
import psutil
import time

def monitor_system():
    while True:
        cpu = psutil.cpu_percent(interval=1)
        memory = psutil.virtual_memory()
        print(f"CPU: {cpu}% | Memory: {memory.percent}% | Available: {memory.available // 1024 // 1024}MB")
        time.sleep(5)

# Run monitoring
monitor_system()
```
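
Because `monitor_system()` loops forever, it blocks whatever calls it. One option (a sketch, not part of the project's code) is to run it on a daemon thread so the main process stays responsive:

```python
import threading

# daemon=True lets the process exit without waiting for the monitor loop
monitor_thread = threading.Thread(target=monitor_system, daemon=True)
monitor_thread.start()
```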
### Database Inspection

```python
# Inspect database tables
from src.db.init_db import session_factory
from src.db.schemas.models import *

session = session_factory()

# Count records in each table
tables = [User, Chat, Message, AgentTemplate, UserTemplatePreference, DeepAnalysisReport]
for table in tables:
    count = session.query(table).count()
    print(f"{table.__name__}: {count} records")
session.close()
```
### Log Analysis

```bash
# View recent logs
tail -f logs/app.log

# Search for errors
grep "ERROR" logs/app.log | tail -20

# Search for specific issues
grep -i "agent" logs/app.log | grep -i "error"
```
## 🚀 Performance Optimization Tips
### Database Optimization

- **Use Indexes**: Ensure frequently queried columns have indexes
- **Query Optimization**: Use `joinedload` for relationships
- **Connection Pooling**: Configure appropriate pool sizes
- **Batch Operations**: Use bulk operations for multiple records (see the sketch after this list)
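
The batch-operations point can look like the following (a minimal sketch; `bulk_save_objects` is standard SQLAlchemy, but the exact `AgentTemplate` columns required here are an assumption based on the fields used elsewhere in this guide):

```python
from src.db.init_db import session_factory
from src.db.schemas.models import AgentTemplate

session = session_factory()

# One bulk INSERT instead of a flush per object (column values are illustrative)
templates = [
    AgentTemplate(template_name=name, is_active=True)
    for name in ("example_agent_a", "example_agent_b")
]
session.bulk_save_objects(templates)
session.commit()
session.close()
```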
### Agent Performance

- **Async Execution**: Use async patterns for concurrent operations (see the sketch after this list)
- **Result Caching**: Cache expensive computations
- **Memory Management**: Clean up large objects after use
- **Code Optimization**: Simplify generated code for better performance
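
A minimal sketch of the async pattern using plain `asyncio` (the agent callables are placeholders, not the project's actual agent interfaces; the names are borrowed from the required-agents list earlier in this guide):

```python
import asyncio

async def run_agent(name: str, query: str) -> str:
    # Placeholder for a real agent call; sleep stands in for I/O-bound work
    await asyncio.sleep(0.1)
    return f"{name} finished: {query}"

async def run_agents_concurrently(query: str):
    # Fan out independent agents and gather their results concurrently
    names = ["preprocessing_agent", "statistical_analytics_agent", "data_viz_agent"]
    return await asyncio.gather(*(run_agent(n, query) for n in names))

results = asyncio.run(run_agents_concurrently("summarize the dataset"))
print(results)
```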
### System Monitoring

- **Resource Tracking**: Monitor CPU, memory, and disk usage
- **Error Monitoring**: Set up alerting for critical errors
- **Performance Metrics**: Track response times and throughput (see the sketch after this list)
- **Usage Analytics**: Monitor feature usage to find optimization opportunities
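
One way to track response times (a hedged sketch that assumes the backend is a FastAPI app, which this guide does not state explicitly; the 1-second threshold is arbitrary):

```python
import logging
import time

from fastapi import FastAPI, Request

app = FastAPI()

@app.middleware("http")
async def track_response_time(request: Request, call_next):
    # Measure wall-clock time per request and log slow responses
    start = time.perf_counter()
    response = await call_next(request)
    elapsed = time.perf_counter() - start
    if elapsed > 1.0:
        logging.warning(f"Slow request {request.url.path}: {elapsed:.2f}s")
    return response
```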
This troubleshooting guide covers the most common issues you'll encounter with the Auto-Analyst backend. For additional help, check the system logs and use the debugging tools provided.