Deep Analysis API Documentation
Overview
The Deep Analysis system provides advanced multi-agent analytical capabilities that automatically generate comprehensive reports based on user goals. The system uses DSPy (Declarative Self-improving Language Programs) to orchestrate multiple AI agents and create detailed analytical insights.
Key Features
- Multi-Agent Analysis: Orchestrates multiple specialized agents (preprocessing, statistical analysis, machine learning, visualization)
- Template Integration: Uses the user's active templates/agents for analysis
- Streaming Progress: Real-time progress updates during analysis execution
- Report Persistence: Stores complete analysis reports in database with metadata
- HTML Export: Generates downloadable HTML reports with visualizations
- Credit Tracking: Monitors token usage, costs, and credits consumed
Template Integration
The deep analysis system integrates with the user's active templates through the agent system:
- Agent Selection: Uses agents from the user's active template preferences (configured via /templates endpoints)
- Default Agents: Falls back to system default agents if the user hasn't configured preferences:
  - preprocessing (both individual and planner variants)
  - statistical_analytics (both individual and planner variants)
  - sk_learn (both individual and planner variants)
  - data_viz (both individual and planner variants)
- Template Limits: Respects the 10-template limit for planner performance optimization
- Dynamic Planning: The planner automatically selects the most appropriate agents based on the analysis goal and available templates (see the sketch after this list)
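As a minimal sketch of this selection logic (the function, constant, and planner-variant names below are hypothetical and only illustrate the default fallback and the 10-template cap described above):

DEFAULT_AGENTS = [
    # variant names are illustrative, not the system's actual identifiers
    "preprocessing", "preprocessing_planner",
    "statistical_analytics", "statistical_analytics_planner",
    "sk_learn", "sk_learn_planner",
    "data_viz", "data_viz_planner",
]
MAX_PLANNER_TEMPLATES = 10  # planner performance limit described above

def resolve_agents(user_preferences):
    """Return the agents the planner may draw on for this analysis."""
    agents = user_preferences or DEFAULT_AGENTS  # fall back to system defaults
    return agents[:MAX_PLANNER_TEMPLATES]        # respect the 10-template limit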
Analysis Flow
The deep analysis process follows these steps:
- Question Generation (20% progress): Generates 5 targeted analytical questions based on the user's goal
- Planning (40% progress): Creates an optimized execution plan using available agents
- Agent Execution (60% progress): Executes analysis using user's active templates
- Code Synthesis (80% progress): Combines and optimizes code from all agents
- Code Execution (85% progress): Runs the synthesized analysis code
- Synthesis (90% progress): Synthesizes results into coherent insights
- Conclusion (100% progress): Generates final conclusions and recommendations
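The step names and progress values can be pictured as a simple mapping. The first three identifiers appear in the streaming example later in this document; the remaining identifiers are assumptions shown only for illustration.

# Rough mapping of streaming steps to the progress values listed above.
ANALYSIS_PROGRESS = {
    "questions": 20,        # question generation
    "planning": 40,         # execution plan creation
    "agent_execution": 60,  # runs the user's active templates
    "code_synthesis": 80,   # combine and optimize agent code (name assumed)
    "code_execution": 85,   # run the synthesized code (name assumed)
    "synthesis": 90,        # synthesize results into insights
    "conclusion": 100,      # final conclusions and recommendations
}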
API Endpoints
Create Deep Analysis Report
POST /deep_analysis/reports
Creates a new deep analysis report in the database.
Request Body:
{
"report_uuid": "string",
"user_id": 123,
"goal": "Analyze customer churn patterns",
"status": "completed",
"deep_questions": "1. What factors...\n2. How does...",
"deep_plan": "{\n \"@preprocessing\": {\n \"create\": [...],\n \"use\": [...],\n \"instruction\": \"...\"\n }\n}",
"summaries": ["Agent summary 1", "Agent summary 2"],
"analysis_code": "import pandas as pd\n# Analysis code...",
"plotly_figures": [{"data": [...], "layout": {...}}],
"synthesis": ["Synthesis result 1"],
"final_conclusion": "## Conclusion\nThe analysis reveals...",
"html_report": "<html>...</html>",
"report_summary": "Brief summary of findings",
"progress_percentage": 100,
"duration_seconds": 120,
"credits_consumed": 5,
"error_message": null,
"model_provider": "anthropic",
"model_name": "claude-sonnet-4-20250514",
"total_tokens_used": 15000,
"estimated_cost": 0.25,
"steps_completed": ["questions", "planning", "execution", "synthesis", "conclusion"]
}
Response:
{
"report_id": 1,
"report_uuid": "uuid-string",
"user_id": 123,
"goal": "Analyze customer churn patterns",
"status": "completed",
"start_time": "2024-01-01T12:00:00Z",
"end_time": "2024-01-01T12:02:00Z",
"duration_seconds": 120,
"report_summary": "Brief summary of findings",
"created_at": "2024-01-01T12:02:00Z",
"updated_at": "2024-01-01T12:02:00Z"
}
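For example, a report could be created from Python with the requests library roughly as follows; the base URL is a placeholder, any authentication headers your deployment requires are omitted, and only a subset of the request-body fields shown above is included.

import requests

BASE_URL = "https://api.example.com"  # placeholder: replace with your deployment's base URL

payload = {
    "report_uuid": "uuid-string",
    "user_id": 123,
    "goal": "Analyze customer churn patterns",
    "status": "completed",
    "progress_percentage": 100,
    "duration_seconds": 120,
    "credits_consumed": 5,
}

resp = requests.post(f"{BASE_URL}/deep_analysis/reports", json=payload)
resp.raise_for_status()
report = resp.json()
print(report["report_id"], report["report_uuid"])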
Get Deep Analysis Reports
GET /deep_analysis/reports
Retrieves a list of deep analysis reports with optional filtering.
Query Parameters:
- user_id (optional): Filter by user ID
- limit (optional): Number of reports to return (1-100, default: 10)
- offset (optional): Number of reports to skip (default: 0)
- status (optional): Filter by status ("pending", "running", "completed", "failed")
Response:
[
{
"report_id": 1,
"report_uuid": "uuid-string",
"user_id": 123,
"goal": "Analyze customer churn patterns",
"status": "completed",
"start_time": "2024-01-01T12:00:00Z",
"end_time": "2024-01-01T12:02:00Z",
"duration_seconds": 120,
"report_summary": "Brief summary of findings",
"created_at": "2024-01-01T12:02:00Z",
"updated_at": "2024-01-01T12:02:00Z"
}
]
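A filtered listing can be fetched like this; the base URL is again a placeholder and authentication is omitted.

import requests

BASE_URL = "https://api.example.com"  # placeholder

params = {"user_id": 123, "limit": 20, "status": "completed"}
resp = requests.get(f"{BASE_URL}/deep_analysis/reports", params=params)
resp.raise_for_status()

for report in resp.json():
    print(report["report_uuid"], report["status"], report["report_summary"])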
Get User Historical Reports
GET /deep_analysis/reports/user_historical
Retrieves all historical deep analysis reports for a specific user.
Query Parameters:
- user_id (required): User ID
- limit (optional): Number of reports to return (1-100, default: 50)
Get Report by ID
GET /deep_analysis/reports/{report_id}
Retrieves a complete deep analysis report by ID.
Query Parameters:
- user_id (optional): Ensures the report belongs to the specified user
Response:
{
"report_id": 1,
"report_uuid": "uuid-string",
"user_id": 123,
"goal": "Analyze customer churn patterns",
"status": "completed",
"start_time": "2024-01-01T12:00:00Z",
"end_time": "2024-01-01T12:02:00Z",
"duration_seconds": 120,
"deep_questions": "1. What factors contribute to churn?\n2. How does churn vary by segment?",
"deep_plan": "{\n \"@preprocessing\": {...},\n \"@statistical_analytics\": {...}\n}",
"summaries": ["Agent performed data cleaning...", "Statistical analysis revealed..."],
"analysis_code": "import pandas as pd\n# Complete analysis code",
"plotly_figures": [{"data": [...], "layout": {...}}],
"synthesis": ["The analysis shows clear patterns..."],
"final_conclusion": "## Conclusion\nCustomer churn is primarily driven by...",
"html_report": "<html>...</html>",
"report_summary": "Analysis of customer churn patterns reveals...",
"progress_percentage": 100,
"credits_consumed": 5,
"error_message": null,
"model_provider": "anthropic",
"model_name": "claude-sonnet-4-20250514",
"total_tokens_used": 15000,
"estimated_cost": 0.25,
"steps_completed": ["questions", "planning", "execution", "synthesis", "conclusion"],
"created_at": "2024-01-01T12:02:00Z",
"updated_at": "2024-01-01T12:02:00Z"
}
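A full report can be retrieved and its stored artifacts consumed roughly as below; the base URL is a placeholder, and rendering the stored figure dicts with plotly.graph_objects is one possible way to display them.

import requests
import plotly.graph_objects as go

BASE_URL = "https://api.example.com"  # placeholder

resp = requests.get(f"{BASE_URL}/deep_analysis/reports/1", params={"user_id": 123})
resp.raise_for_status()
report = resp.json()

print(report["final_conclusion"])
for fig_dict in report["plotly_figures"]:   # stored Plotly figure dicts
    go.Figure(fig_dict).show()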
Get Report by UUID
GET /deep_analysis/reports/uuid/{report_uuid}
Retrieves a complete deep analysis report by UUID. Same response format as get by ID.
Delete Report
DELETE /deep_analysis/reports/{report_id}
Deletes a deep analysis report.
Query Parameters:
- user_id (optional): Ensures the report belongs to the specified user
Response:
{
"message": "Report 1 deleted successfully"
}
Update Report Status
PUT /deep_analysis/reports/{report_id}/status
Updates the status of a deep analysis report.
Request Body:
{
"status": "completed"
}
Valid Status Values:
- pending: Analysis queued but not started
- running: Analysis in progress
- completed: Analysis finished successfully
- failed: Analysis encountered errors
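A status update is a single PUT with the new status in the body, for example (placeholder base URL, authentication omitted):

import requests

BASE_URL = "https://api.example.com"  # placeholder

resp = requests.put(
    f"{BASE_URL}/deep_analysis/reports/1/status",
    json={"status": "completed"},
)
resp.raise_for_status()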
Get HTML Report
GET /deep_analysis/reports/uuid/{report_uuid}/html
Retrieves only the HTML report content for a specific analysis.
Query Parameters:
- user_id (optional): Ensures the report belongs to the specified user
Response:
{
"html_report": "<html>...</html>",
"filename": "deep_analysis_report_20240101_120200.html"
}
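The returned HTML can be written straight to disk using the filename the endpoint suggests; the base URL and UUID below are placeholders.

import requests

BASE_URL = "https://api.example.com"  # placeholder
report_uuid = "uuid-string"           # placeholder UUID

resp = requests.get(
    f"{BASE_URL}/deep_analysis/reports/uuid/{report_uuid}/html",
    params={"user_id": 123},
)
resp.raise_for_status()
body = resp.json()

with open(body["filename"], "w", encoding="utf-8") as f:
    f.write(body["html_report"])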
Download HTML Report
POST /deep_analysis/download_from_db/{report_uuid}
Downloads the HTML report as a file attachment.
Query Parameters:
- user_id (optional): Ensures the report belongs to the specified user
Response:
- Content-Type: text/html; charset=utf-8
- Content-Disposition: attachment; filename="deep_analysis_report_TIMESTAMP.html"
Deep Analysis Module Architecture
DSPy Signatures
The system uses several DSPy signatures for different analysis phases:
1. deep_questions
Generates 5 targeted analytical questions based on the user's goal and dataset structure.
2. deep_planner
Creates an optimized execution plan using the user's active templates/agents. The planner:
- Verifies feasibility using available datasets and agent descriptions
- Batches similar questions per agent call for efficiency
- Reuses outputs across questions to minimize agent calls
- Defines clear variable flow and dependencies between agents
3. deep_code_synthesizer
Combines and optimizes code from multiple agents:
- Fixes errors and inconsistencies between agent outputs
- Ensures proper data flow and type handling
- Converts all visualizations to Plotly format
- Adds comprehensive error handling and validation
4. deep_synthesizer
Synthesizes analysis results into coherent insights and findings.
5. final_conclusion
Generates final conclusions and strategic recommendations based on all analysis results.
Streaming Analysis
The execute_deep_analysis_streaming method provides real-time progress updates:
async for update in deep_analysis.execute_deep_analysis_streaming(goal, dataset_info, session_df):
    if update["step"] == "questions":
        ...  # handle question-generation progress
    elif update["step"] == "planning":
        ...  # handle planning progress
    elif update["step"] == "agent_execution":
        ...  # handle agent-execution progress
    # ... handle the remaining steps
Integration with User Templates
The deep analysis system integrates with user templates in several ways:
- Agent Discovery: Retrieves user's active template preferences from the database
- Dynamic Planning: The planner uses available agents to create optimal execution plans
- Template Validation: Ensures all referenced agents exist in the user's active templates
- Fallback Handling: Uses default agents if user preferences are incomplete
- Performance Optimization: Respects template limits for efficient execution
Error Handling
The system includes comprehensive error handling:
- Code Execution Errors: Automatically attempts to fix and retry failed code
- Template Missing: Falls back to default agents if user templates are unavailable
- Timeout Protection: Includes timeouts for long-running operations
- Memory Management: Handles large datasets and visualizations efficiently
- Unicode Handling: Cleans problematic characters that might cause encoding issues
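As a rough illustration of the code-execution error handling, generated code might be run in an isolated namespace so failures can be caught and fed back for a fix-and-retry; this is a minimal sketch, not the module's actual implementation.

def run_analysis_code(code: str) -> dict:
    """Run generated analysis code in an isolated namespace; raise on failure."""
    namespace: dict = {}
    try:
        exec(code, namespace)  # execute the synthesized analysis code
    except Exception as exc:
        # In the real flow this error would be fed back to the model for a fix-and-retry.
        raise RuntimeError(f"Analysis code failed: {exc}") from exc
    return namespace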
Visualization Integration
All visualizations are standardized to Plotly format:
- Consistent styling and color schemes
- Interactive features (zoom, pan, hover)
- Accessibility compliance (colorblind-friendly palettes)
- Export capabilities for reports
- Responsive design for different screen sizes
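As a minimal sketch of what this standardization can look like (the actual styling rules are not specified here), a Plotly figure can be given a consistent template, a colorblind-friendly palette, and responsive sizing before being stored:

import plotly.express as px

fig = px.scatter(x=[1, 2, 3], y=[3, 1, 2], title="Example figure")
fig.update_layout(
    template="plotly_white",              # consistent styling
    colorway=px.colors.qualitative.Safe,  # colorblind-friendly palette
    autosize=True,                        # responsive sizing
)
figure_dict = fig.to_plotly_json()        # dict with "data" and "layout", as stored in plotly_figures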
Frontend Integration
The deep analysis system includes React components for:
- DeepAnalysisSidebar: Main interface for starting and managing analyses
- NewAnalysisForm: Form for initiating new deep analyses
- CurrentAnalysisView: Real-time progress tracking during analysis
- HistoryView: Browse and access historical analysis reports
- AnalysisStep: Individual step progress visualization
The frontend integrates with the streaming API to provide real-time feedback and uses the user's active template configuration for personalized analysis capabilities.
Credit and Cost Tracking
The system tracks detailed usage metrics:
- Credits Consumed: Number of credits deducted from user account
- Token Usage: Total tokens used across all model calls
- Estimated Cost: Dollar cost estimate based on model pricing
- Model Information: Provider and model name used for analysis
- Execution Time: Duration of analysis for performance monitoring
This information helps users understand resource consumption and optimize their analysis strategies.
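As a rough illustration of the arithmetic, estimated_cost can be derived from total_tokens_used and a per-1k-token rate; the rate below is a placeholder chosen only so the numbers line up with the example responses above, not actual provider pricing.

def estimate_cost(total_tokens: int, rate_per_1k_tokens: float) -> float:
    """Rough dollar estimate from total token usage and a per-1k-token rate."""
    return round(total_tokens / 1000 * rate_per_1k_tokens, 2)

# Hypothetical rate: 15,000 tokens at $0.0167 per 1k tokens is about $0.25,
# matching total_tokens_used and estimated_cost in the examples above.
print(estimate_cost(15000, 0.0167))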