
Deployment Guide

Launching DeepCritical: Gradio, MCP, & Modal


Overview

DeepCritical is designed for a multi-platform deployment strategy to maximize hackathon impact:

  1. HuggingFace Spaces: Host the Gradio UI (User Interface).
  2. MCP Server: Expose research tools to Claude Desktop/Agents.
  3. Modal (Optional): Run heavy inference or self-hosted LLMs if API costs are prohibitive.

1. HuggingFace Spaces (Gradio UI)

Goal: A public URL where judges/users can try the research agent.

Prerequisites

  • HuggingFace Account
  • gradio installed (uv add gradio)

Steps

  1. Create Space:

    • Go to HF Spaces -> Create New Space.
    • SDK: Gradio.
    • Hardware: CPU Basic (Free) is sufficient (since we use APIs).
  2. Prepare Files:

    • Ensure app.py contains the Gradio interface construction.
    • Ensure requirements.txt lists every runtime dependency (Gradio Spaces install from requirements.txt, so keep it in sync with pyproject.toml).
  3. Secrets:

    • Go to Space Settings -> Repository secrets.
    • Add ANTHROPIC_API_KEY (or your chosen LLM provider key).
    • Add BRAVE_API_KEY (for web search). Secrets are injected as environment variables at runtime (see the snippet after these steps).
  4. Deploy:

    • Push code to the Space's git repo.
    • Watch "Build" logs.
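
Inside the Space, repository secrets surface as ordinary environment variables. A minimal sketch of reading them in app.py (the variable handling here is an assumption, not the repo's actual code):

# Read the secrets that HF Spaces injects as environment variables
import os

anthropic_key = os.environ["ANTHROPIC_API_KEY"]  # fail fast if the key is missing
brave_key = os.environ.get("BRAVE_API_KEY")      # optional: only needed for web search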

Streaming Optimization

Ensure app.py uses generator functions for the chat interface to prevent timeouts:

# app.py
import gradio as gr

from src.agent import ResearchAgent  # adjust this import to the actual project layout

def predict(message, history):
    agent = ResearchAgent()
    for update in agent.research_stream(message):
        yield update  # stream each partial result to the UI

gr.ChatInterface(predict).launch()

2. MCP Server Deployment

Goal: Allow other agents (like Claude Desktop) to use our PubMed/Research tools directly.
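
For orientation, a minimal FastMCP server looks like the sketch below; the tool name and body are illustrative, not the repo's actual src/mcp_servers/pubmed_server.py implementation.

# pubmed_server.py -- minimal FastMCP sketch (illustrative)
from fastmcp import FastMCP

mcp = FastMCP("deepcritical-pubmed")

@mcp.tool()
def search_pubmed(query: str, max_results: int = 5) -> list[str]:
    """Search PubMed and return brief article summaries."""
    ...  # e.g., query NCBI E-utilities with httpx and format the hits

if __name__ == "__main__":
    mcp.run()  # stdio transport by default, which Claude Desktop expects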

Local Usage (Claude Desktop)

  1. Install:

    uv sync
    
  2. Configure Claude Desktop: Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS; on Windows it is %APPDATA%\Claude\claude_desktop_config.json). Claude Desktop's config has no reliable cwd key, so pass the project directory to uv via --directory:

    {
      "mcpServers": {
        "deepcritical": {
          "command": "uv",
          "args": [
            "run",
            "--directory", "/absolute/path/to/DeepCritical",
            "fastmcp", "run", "src/mcp_servers/pubmed_server.py"
          ]
        }
      }
    }
    
  3. Restart Claude: You should see a 🔌 icon indicating connected tools.

Remote Deployment (Smithery/Glama)

Target for "MCP Track" bonus points.

  1. Dockerize: Create a Dockerfile for the MCP server.

    FROM python:3.11-slim
    WORKDIR /app
    COPY . /app
    RUN pip install fastmcp httpx
    CMD ["fastmcp", "run", "src/mcp_servers/pubmed_server.py", "--transport", "sse"]
    

    Note: Use SSE transport for remote/HTTP servers (an in-code alternative is sketched after this list).

  2. Deploy: Host on Fly.io or Railway.
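
The transport can also be selected in code rather than on the CLI; a sketch, assuming the server object is named mcp as in the earlier example:

# At the bottom of pubmed_server.py: serve over SSE for remote clients
if __name__ == "__main__":
    mcp.run(transport="sse", host="0.0.0.0", port=8000)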


3. Modal (GPU/Heavy Compute)

Goal: Run an open-weights LLM (e.g., Llama-3-70B) or handle massive parallel searches if hosted APIs are too slow or expensive.

Setup

  1. Install: uv add modal
  2. Auth: modal token new

Logic

Instead of calling the Anthropic API, we call a Modal function:

# src/llm/modal_client.py
import modal

# modal.Stub was renamed to modal.App in current Modal releases
app = modal.App("deepcritical-inference")

@app.function(gpu="A100")
def generate_text(prompt: str) -> str:
    # Load vLLM or similar inside the container and run inference
    ...

# Caller side: generate_text.remote(prompt) runs this on Modal's GPU

When to use?

  • Hackathon Demo: Stick to Anthropic/OpenAI APIs for speed/reliability.
  • Production/Stretch: Use Modal if you hit rate limits or want to show off "Open Source Models" capability (one possible toggle is sketched below).
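
One way to keep both paths is a small switch in the LLM client. A hypothetical sketch (the USE_MODAL flag and the fallback are assumptions, not existing code):

# Hypothetical switch between hosted APIs and Modal inference
import os

def generate(prompt: str) -> str:
    if os.environ.get("USE_MODAL") == "1":
        from src.llm.modal_client import generate_text
        return generate_text.remote(prompt)  # runs remotely on Modal's GPU
    ...  # otherwise call the Anthropic/OpenAI client as usual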

Deployment Checklist

Pre-Flight

  • Run pytest -m unit to ensure logic is sound.
  • Run pytest -m e2e (one pass) to verify APIs connect.
  • Check that requirements.txt matches pyproject.toml (uv export --format requirements-txt can regenerate it).

Secrets Management

  • NEVER commit .env files.
  • Verify keys are added to HF Space settings; for local runs, see the .env pattern below.
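
For local development, a common pattern is to keep keys in an untracked .env file and load it with python-dotenv; a sketch, assuming python-dotenv is added as a dev dependency:

# Local development only: load keys from an untracked .env file
from dotenv import load_dotenv

load_dotenv()  # does nothing when no .env exists, e.g., on HF Spaces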

Post-Launch

  • Test the live URL.
  • Verify "Stop" button in Gradio works (interrupts the agent).
  • Record a walkthrough video (crucial for hackathon submission).