Deep Analysis API Documentation

Overview

The Deep Analysis system provides advanced multi-agent analytical capabilities that automatically generate comprehensive reports based on user goals. The system uses DSPy (Declarative Self-improving Language Programs) to orchestrate multiple AI agents and create detailed analytical insights.

Key Features

  • Multi-Agent Analysis: Orchestrates multiple specialized agents (preprocessing, statistical analysis, machine learning, visualization)
  • Template Integration: Uses the user's active templates/agents for analysis
  • Streaming Progress: Real-time progress updates during analysis execution
  • Report Persistence: Stores complete analysis reports in database with metadata
  • HTML Export: Generates downloadable HTML reports with visualizations
  • Credit Tracking: Monitors token usage, costs, and credits consumed

Template Integration

The deep analysis system integrates with the user's active templates through the agent system:

  1. Agent Selection: Uses agents from the user's active template preferences (configured via /templates endpoints)
  2. Default Agents: Falls back to system default agents if the user hasn't configured preferences:
    • preprocessing (both individual and planner variants)
    • statistical_analytics (both individual and planner variants)
    • sk_learn (both individual and planner variants)
    • data_viz (both individual and planner variants)
  3. Template Limits: Respects the 10-template limit for planner performance optimization
  4. Dynamic Planning: The planner automatically selects the most appropriate agents based on the analysis goal and available templates

Analysis Flow

The deep analysis process follows these steps (a sketch of the corresponding streaming progress update is shown after the list):

  1. Question Generation (20% progress): Generates 5 targeted analytical questions based on the user's goal
  2. Planning (40% progress): Creates an optimized execution plan using available agents
  3. Agent Execution (60% progress): Executes analysis using user's active templates
  4. Code Synthesis (80% progress): Combines and optimizes code from all agents
  5. Code Execution (85% progress): Runs the synthesized analysis code
  6. Synthesis (90% progress): Synthesizes results into coherent insights
  7. Conclusion (100% progress): Generates final conclusions and recommendations
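
Each step is surfaced to clients as a streaming progress update. A minimal sketch of one such update is shown below; only the step field is confirmed by the streaming example later in this document, and the other field names and values are illustrative assumptions:

# Illustrative shape of a single streaming progress update.
# Only "step" is confirmed by the streaming example below; the other fields are assumptions.
progress_update = {
    "step": "planning",       # current phase of the analysis
    "progress": 40,           # percentage matching the step weights above
    "message": "Execution plan created using the user's active agents",
}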

API Endpoints

Create Deep Analysis Report

POST /deep_analysis/reports

Creates a new deep analysis report in the database.

Request Body:

{
  "report_uuid": "string",
  "user_id": 123,
  "goal": "Analyze customer churn patterns",
  "status": "completed",
  "deep_questions": "1. What factors...\n2. How does...",
  "deep_plan": "{\n  \"@preprocessing\": {\n    \"create\": [...],\n    \"use\": [...],\n    \"instruction\": \"...\"\n  }\n}",
  "summaries": ["Agent summary 1", "Agent summary 2"],
  "analysis_code": "import pandas as pd\n# Analysis code...",
  "plotly_figures": [{"data": [...], "layout": {...}}],
  "synthesis": ["Synthesis result 1"],
  "final_conclusion": "## Conclusion\nThe analysis reveals...",
  "html_report": "<html>...</html>",
  "report_summary": "Brief summary of findings",
  "progress_percentage": 100,
  "duration_seconds": 120,
  "credits_consumed": 5,
  "error_message": null,
  "model_provider": "anthropic",
  "model_name": "claude-sonnet-4-20250514",
  "total_tokens_used": 15000,
  "estimated_cost": 0.25,
  "steps_completed": ["questions", "planning", "execution", "synthesis", "conclusion"]
}

Response:

{
  "report_id": 1,
  "report_uuid": "uuid-string",
  "user_id": 123,
  "goal": "Analyze customer churn patterns",
  "status": "completed",
  "start_time": "2024-01-01T12:00:00Z",
  "end_time": "2024-01-01T12:02:00Z",
  "duration_seconds": 120,
  "report_summary": "Brief summary of findings",
  "created_at": "2024-01-01T12:02:00Z",
  "updated_at": "2024-01-01T12:02:00Z"
}
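
As a usage illustration, here is a minimal Python sketch for creating a report record. The base URL is an assumption about a local deployment and any authentication is omitted; adapt both to your environment:

import requests

BASE_URL = "http://localhost:8000"  # assumed local deployment; adjust as needed

payload = {
    "report_uuid": "uuid-string",
    "user_id": 123,
    "goal": "Analyze customer churn patterns",
    "status": "completed",
    "progress_percentage": 100,
    "duration_seconds": 120,
}

# Persist a completed analysis report in the database
resp = requests.post(f"{BASE_URL}/deep_analysis/reports", json=payload)
resp.raise_for_status()
print(resp.json()["report_id"])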

Get Deep Analysis Reports

GET /deep_analysis/reports

Retrieves a list of deep analysis reports with optional filtering.

Query Parameters:

  • user_id (optional): Filter by user ID
  • limit (optional): Number of reports to return (1-100, default: 10)
  • offset (optional): Number of reports to skip (default: 0)
  • status (optional): Filter by status ("pending", "running", "completed", "failed")

Response:

[
  {
    "report_id": 1,
    "report_uuid": "uuid-string",
    "user_id": 123,
    "goal": "Analyze customer churn patterns",
    "status": "completed",
    "start_time": "2024-01-01T12:00:00Z",
    "end_time": "2024-01-01T12:02:00Z",
    "duration_seconds": 120,
    "report_summary": "Brief summary of findings",
    "created_at": "2024-01-01T12:02:00Z",
    "updated_at": "2024-01-01T12:02:00Z"
  }
]
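
A minimal sketch of listing a user's completed reports with the query parameters above (the base URL is an assumption):

import requests

BASE_URL = "http://localhost:8000"  # assumed deployment URL

# Fetch the 10 most recent completed reports for user 123
params = {"user_id": 123, "status": "completed", "limit": 10, "offset": 0}
reports = requests.get(f"{BASE_URL}/deep_analysis/reports", params=params).json()

for report in reports:
    print(report["report_id"], report["goal"], report["status"])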

Get User Historical Reports

GET /deep_analysis/reports/user_historical

Retrieves all historical deep analysis reports for a specific user.

Query Parameters:

  • user_id: User ID (required)
  • limit (optional): Number of reports to return (1-100, default: 50)

Get Report by ID

GET /deep_analysis/reports/{report_id}

Retrieves a complete deep analysis report by ID.

Query Parameters:

  • user_id (optional): Ensures report belongs to specified user

Response:

{
  "report_id": 1,
  "report_uuid": "uuid-string",
  "user_id": 123,
  "goal": "Analyze customer churn patterns",
  "status": "completed",
  "start_time": "2024-01-01T12:00:00Z",
  "end_time": "2024-01-01T12:02:00Z",
  "duration_seconds": 120,
  "deep_questions": "1. What factors contribute to churn?\n2. How does churn vary by segment?",
  "deep_plan": "{\n  \"@preprocessing\": {...},\n  \"@statistical_analytics\": {...}\n}",
  "summaries": ["Agent performed data cleaning...", "Statistical analysis revealed..."],
  "analysis_code": "import pandas as pd\n# Complete analysis code",
  "plotly_figures": [{"data": [...], "layout": {...}}],
  "synthesis": ["The analysis shows clear patterns..."],
  "final_conclusion": "## Conclusion\nCustomer churn is primarily driven by...",
  "html_report": "<html>...</html>",
  "report_summary": "Analysis of customer churn patterns reveals...",
  "progress_percentage": 100,
  "credits_consumed": 5,
  "error_message": null,
  "model_provider": "anthropic",
  "model_name": "claude-sonnet-4-20250514",
  "total_tokens_used": 15000,
  "estimated_cost": 0.25,
  "steps_completed": ["questions", "planning", "execution", "synthesis", "conclusion"],
  "created_at": "2024-01-01T12:02:00Z",
  "updated_at": "2024-01-01T12:02:00Z"
}
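
Because plotly_figures is returned as a list of figure dictionaries, a client can rehydrate them for interactive display. A hedged sketch (base URL assumed):

import requests
import plotly.graph_objects as go

BASE_URL = "http://localhost:8000"  # assumed deployment URL

# Fetch the complete report and rebuild interactive figures from the stored dictionaries
report = requests.get(f"{BASE_URL}/deep_analysis/reports/1", params={"user_id": 123}).json()

for fig_dict in report.get("plotly_figures", []):
    fig = go.Figure(fig_dict)  # reconstruct the figure from its data/layout dict
    fig.show()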

Get Report by UUID

GET /deep_analysis/reports/uuid/{report_uuid}

Retrieves a complete deep analysis report by UUID. Same response format as get by ID.

Delete Report

DELETE /deep_analysis/reports/{report_id}

Deletes a deep analysis report.

Query Parameters:

  • user_id (optional): Ensures report belongs to specified user

Response:

{
  "message": "Report 1 deleted successfully"
}

Update Report Status

PUT /deep_analysis/reports/{report_id}/status

Updates the status of a deep analysis report.

Request Body:

{
  "status": "completed"
}

Valid Status Values:

  • pending: Analysis queued but not started
  • running: Analysis in progress
  • completed: Analysis finished successfully
  • failed: Analysis encountered errors
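
A minimal sketch of updating a report's status from a client (base URL assumed):

import requests

BASE_URL = "http://localhost:8000"  # assumed deployment URL

# Mark report 1 as completed once the analysis has finished
resp = requests.put(
    f"{BASE_URL}/deep_analysis/reports/1/status",
    json={"status": "completed"},
)
resp.raise_for_status()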

Get HTML Report

GET /deep_analysis/reports/uuid/{report_uuid}/html

Retrieves only the HTML report content for a specific analysis.

Query Parameters:

  • user_id (optional): Ensures report belongs to specified user

Response:

{
  "html_report": "<html>...</html>",
  "filename": "deep_analysis_report_20240101_120200.html"
}

Download HTML Report

POST /deep_analysis/download_from_db/{report_uuid}

Downloads the HTML report as a file attachment.

Query Parameters:

  • user_id (optional): Ensures report belongs to specified user

Response:

  • Content-Type: text/html; charset=utf-8
  • Content-Disposition: attachment; filename="deep_analysis_report_TIMESTAMP.html"
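
A hedged sketch of downloading the HTML report and saving it to disk (base URL and local filename are assumptions):

import requests

BASE_URL = "http://localhost:8000"  # assumed deployment URL
report_uuid = "uuid-string"

# Request the HTML attachment and write it to a local file
resp = requests.post(
    f"{BASE_URL}/deep_analysis/download_from_db/{report_uuid}",
    params={"user_id": 123},
)
resp.raise_for_status()

with open("deep_analysis_report.html", "wb") as f:
    f.write(resp.content)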

Deep Analysis Module Architecture

DSPy Signatures

The system uses several DSPy signatures for different analysis phases:

1. deep_questions

Generates 5 targeted analytical questions based on the user's goal and dataset structure.
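
The actual signature definitions live in the backend module. As a rough illustration of the pattern, a hypothetical DSPy signature and its invocation might look like this; the class name, field names, and model string are assumptions, not the real implementation:

import dspy

# Illustrative only: assumes an LM has already been configured, e.g.
# dspy.configure(lm=dspy.LM("anthropic/claude-sonnet-4-20250514"))

class DeepQuestions(dspy.Signature):
    """Generate targeted analytical questions for a user's goal and dataset."""
    goal: str = dspy.InputField(desc="the user's analysis goal")
    dataset_info: str = dspy.InputField(desc="schema and summary of the available dataset")
    questions: str = dspy.OutputField(desc="five targeted analytical questions")

# A ChainOfThought module built from the signature produces the questions
question_generator = dspy.ChainOfThought(DeepQuestions)
result = question_generator(goal="Analyze customer churn patterns",
                            dataset_info="columns: tenure, plan, monthly_charges, churn")
print(result.questions)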

2. deep_planner

Creates an optimized execution plan using the user's active templates/agents. The planner:

  • Verifies feasibility using available datasets and agent descriptions
  • Batches similar questions per agent call for efficiency
  • Reuses outputs across questions to minimize agent calls
  • Defines clear variable flow and dependencies between agents

3. deep_code_synthesizer

Combines and optimizes code from multiple agents:

  • Fixes errors and inconsistencies between agent outputs
  • Ensures proper data flow and type handling
  • Converts all visualizations to Plotly format
  • Adds comprehensive error handling and validation

4. deep_synthesizer

Synthesizes analysis results into coherent insights and findings.

5. final_conclusion

Generates final conclusions and strategic recommendations based on all analysis results.

Streaming Analysis

The execute_deep_analysis_streaming method provides real-time progress updates:

async for update in deep_analysis.execute_deep_analysis_streaming(goal, dataset_info, session_df):
    if update["step"] == "questions":
        ...  # handle question generation progress
    elif update["step"] == "planning":
        ...  # handle planning progress
    elif update["step"] == "agent_execution":
        ...  # handle agent execution progress
    # ... handle other steps

Integration with User Templates

The deep analysis system integrates with user templates in several ways (a minimal fallback sketch follows the list):

  1. Agent Discovery: Retrieves user's active template preferences from the database
  2. Dynamic Planning: The planner uses available agents to create optimal execution plans
  3. Template Validation: Ensures all referenced agents exist in the user's active templates
  4. Fallback Handling: Uses default agents if user preferences are incomplete
  5. Performance Optimization: Respects template limits for efficient execution
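
As a rough illustration of the fallback behaviour, a hypothetical helper might look like the following; the function name, data shape, and default list are assumptions based on the defaults described earlier:

# Hypothetical helper illustrating the fallback to default agents (names are assumptions)
DEFAULT_AGENTS = ["preprocessing", "statistical_analytics", "sk_learn", "data_viz"]
MAX_PLANNER_TEMPLATES = 10  # planner performance limit described above

def select_agents(user_template_preferences):
    """Return the user's active agents, or the defaults if none are configured."""
    active = [t for t in (user_template_preferences or []) if t.get("is_active")]
    if not active:
        return DEFAULT_AGENTS
    return [t["template_name"] for t in active][:MAX_PLANNER_TEMPLATES]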

Error Handling

The system includes comprehensive error handling (a sketch of the fix-and-retry behaviour follows the list):

  • Code Execution Errors: Automatically attempts to fix and retry failed code
  • Template Missing: Falls back to default agents if user templates are unavailable
  • Timeout Protection: Includes timeouts for long-running operations
  • Memory Management: Handles large datasets and visualizations efficiently
  • Unicode Handling: Cleans problematic characters that might cause encoding issues
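
As a rough illustration of the fix-and-retry behaviour for code execution errors, a hypothetical loop is sketched below; the function names and retry count are assumptions, not the actual implementation:

# Hypothetical fix-and-retry loop for failed analysis code (names are assumptions)
MAX_RETRIES = 2

def run_with_retries(code, execute_code, fix_code):
    """Execute analysis code, asking the model-backed fixer to repair it on failure."""
    for attempt in range(MAX_RETRIES + 1):
        try:
            return execute_code(code)
        except Exception as error:
            if attempt == MAX_RETRIES:
                raise
            # Ask the fixer to repair the code using the error message
            code = fix_code(code, str(error))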

Visualization Integration

All visualizations are standardized to Plotly format (a small styling sketch follows the list):

  • Consistent styling and color schemes
  • Interactive features (zoom, pan, hover)
  • Accessibility compliance (colorblind-friendly palettes)
  • Export capabilities for reports
  • Responsive design for different screen sizes
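
A small sketch of the kind of Plotly standardization described above; the template and colour choices are illustrative, not the project's actual styling:

import plotly.graph_objects as go

# Colorblind-friendly palette (illustrative, not the project's actual palette)
COLORWAY = ["#0072B2", "#E69F00", "#009E73", "#CC79A7"]

def standardize_figure(fig: go.Figure) -> go.Figure:
    """Apply a consistent, report-friendly layout to a Plotly figure."""
    fig.update_layout(
        template="plotly_white",   # consistent styling across reports
        colorway=COLORWAY,         # accessible colours
        autosize=True,             # responsive sizing
        hovermode="closest",       # interactive hover behaviour
    )
    return fig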

Frontend Integration

The deep analysis system includes React components for:

  • DeepAnalysisSidebar: Main interface for starting and managing analyses
  • NewAnalysisForm: Form for initiating new deep analyses
  • CurrentAnalysisView: Real-time progress tracking during analysis
  • HistoryView: Browse and access historical analysis reports
  • AnalysisStep: Individual step progress visualization

The frontend integrates with the streaming API to provide real-time feedback and uses the user's active template configuration for personalized analysis capabilities.

Credit and Cost Tracking

The system tracks detailed usage metrics:

  • Credits Consumed: Number of credits deducted from user account
  • Token Usage: Total tokens used across all model calls
  • Estimated Cost: Dollar cost estimate based on model pricing
  • Model Information: Provider and model name used for analysis
  • Execution Time: Duration of analysis for performance monitoring

This information helps users understand resource consumption and optimize their analysis strategies.
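
As a worked illustration of how the cost figures relate, using the example report above; the per-token price is hypothetical, not actual provider pricing:

# Hypothetical cost arithmetic for the metrics above (price is illustrative)
total_tokens_used = 15_000
price_per_1k_tokens = 0.0167        # assumed blended $/1K tokens, not real pricing

estimated_cost = (total_tokens_used / 1000) * price_per_1k_tokens
print(f"Estimated cost: ${estimated_cost:.2f}")   # roughly $0.25 for the example report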