Phase 1 Implementation Spec: Foundation & Tooling
Goal: Establish a "Gucci Banger" development environment using 2025 best practices. Philosophy: "If the build isn't solid, the agent won't be."
1. Prerequisites
Before starting, ensure these are installed:
# Install uv (Rust-based package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Verify
uv --version # Should be >= 0.4.0
2. Project Initialization
# From project root
uv init --name deepcritical
uv python install 3.11  # Install the interpreter
uv python pin 3.11      # Pin the project to 3.11 (writes .python-version)
3. The Tooling Stack (Exact Dependencies)
pyproject.toml (Complete, Copy-Paste Ready)
[project]
name = "deepcritical"
version = "0.1.0"
description = "AI-Native Drug Repurposing Research Agent"
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
# Core
"pydantic>=2.7",
"pydantic-settings>=2.2", # For BaseSettings (config)
"pydantic-ai>=0.0.16", # Agent framework
# HTTP & Parsing
"httpx>=0.27", # Async HTTP client
"beautifulsoup4>=4.12", # HTML parsing
"xmltodict>=0.13", # PubMed XML -> dict
# Search
"duckduckgo-search>=6.0", # Free web search
# UI
"gradio>=5.0", # Chat interface
# Utils
"python-dotenv>=1.0", # .env loading
"tenacity>=8.2", # Retry logic
"structlog>=24.1", # Structured logging
]
[project.optional-dependencies]
dev = [
# Testing
"pytest>=8.0",
"pytest-asyncio>=0.23",
"pytest-sugar>=1.0",
"pytest-cov>=5.0",
"pytest-mock>=3.12",
"respx>=0.21", # Mock httpx requests
# Quality
"ruff>=0.4.0",
"mypy>=1.10",
"pre-commit>=3.7",
]
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[tool.hatch.build.targets.wheel]
packages = ["src"]
# ============== RUFF CONFIG ==============
[tool.ruff]
line-length = 100
target-version = "py311"
src = ["src", "tests"]
[tool.ruff.lint]
select = [
"E", # pycodestyle errors
"F", # pyflakes
"B", # flake8-bugbear
"I", # isort
"N", # pep8-naming
"UP", # pyupgrade
"PL", # pylint
"RUF", # ruff-specific
]
ignore = [
"PLR0913", # Too many arguments (agents need many params)
]
[tool.ruff.lint.isort]
known-first-party = ["src"]
# ============== MYPY CONFIG ==============
[tool.mypy]
python_version = "3.11"
strict = true
ignore_missing_imports = true
disallow_untyped_defs = true
warn_return_any = true
warn_unused_ignores = true
# ============== PYTEST CONFIG ==============
[tool.pytest.ini_options]
testpaths = ["tests"]
asyncio_mode = "auto"
addopts = [
"-v",
"--tb=short",
"--strict-markers",
]
markers = [
"unit: Unit tests (mocked)",
"integration: Integration tests (real APIs)",
"slow: Slow tests",
]
# ============== COVERAGE CONFIG ==============
[tool.coverage.run]
source = ["src"]
omit = ["*/__init__.py"]
[tool.coverage.report]
exclude_lines = [
"pragma: no cover",
"if TYPE_CHECKING:",
"raise NotImplementedError",
]
4. Directory Structure (Maintainer's Structure)
# Execute these commands to create the directory structure
mkdir -p src/utils
mkdir -p src/tools
mkdir -p src/prompts
mkdir -p src/agent_factory
mkdir -p src/middleware
mkdir -p src/database_services
mkdir -p src/retrieval_factory
mkdir -p tests/unit/tools
mkdir -p tests/unit/agent_factory
mkdir -p tests/unit/utils
mkdir -p tests/integration
# Create __init__.py files (required for imports)
touch src/__init__.py
touch src/utils/__init__.py
touch src/tools/__init__.py
touch src/prompts/__init__.py
touch src/agent_factory/__init__.py
touch tests/__init__.py
touch tests/unit/__init__.py
touch tests/unit/tools/__init__.py
touch tests/unit/agent_factory/__init__.py
touch tests/unit/utils/__init__.py
touch tests/integration/__init__.py
Final Structure:
src/
├── __init__.py
├── app.py                 # Entry point (Gradio UI)
├── orchestrator.py        # Agent loop
├── agent_factory/         # Agent creation and judges
│   ├── __init__.py
│   ├── agents.py
│   └── judges.py
├── tools/                 # Search tools
│   ├── __init__.py
│   ├── pubmed.py
│   ├── websearch.py
│   └── search_handler.py
├── prompts/               # Prompt templates
│   ├── __init__.py
│   └── judge.py
├── utils/                 # Shared utilities
│   ├── __init__.py
│   ├── config.py
│   ├── exceptions.py
│   ├── models.py
│   ├── dataloaders.py
│   └── parsers.py
├── middleware/            # (Future)
├── database_services/     # (Future)
└── retrieval_factory/     # (Future)
tests/
├── __init__.py
├── conftest.py
├── unit/
│   ├── __init__.py
│   ├── tools/
│   │   ├── __init__.py
│   │   ├── test_pubmed.py
│   │   ├── test_websearch.py
│   │   └── test_search_handler.py
│   ├── agent_factory/
│   │   ├── __init__.py
│   │   └── test_judges.py
│   ├── utils/
│   │   ├── __init__.py
│   │   └── test_config.py
│   └── test_orchestrator.py
└── integration/
    ├── __init__.py
    └── test_pubmed_live.py
5. Configuration Files
.env.example (Copy to .env and fill)
# LLM Provider (choose one)
OPENAI_API_KEY=sk-your-key-here
ANTHROPIC_API_KEY=sk-ant-your-key-here
# Optional: PubMed API key (higher rate limits)
NCBI_API_KEY=your-ncbi-key-here
# Optional: For HuggingFace deployment
HF_TOKEN=hf_your-token-here
# Agent Config
MAX_ITERATIONS=10
LOG_LEVEL=INFO
.pre-commit-config.yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4
    hooks:
      - id: ruff
        args: [--fix]
      - id: ruff-format
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.10.0
    hooks:
      - id: mypy
        additional_dependencies:
          - pydantic>=2.7
          - pydantic-settings>=2.2
        args: [--ignore-missing-imports]
tests/conftest.py (Pytest Fixtures)
"""Shared pytest fixtures for all tests."""
import pytest
from unittest.mock import AsyncMock
@pytest.fixture
def mock_httpx_client(mocker):
"""Mock httpx.AsyncClient for API tests."""
mock = mocker.patch("httpx.AsyncClient")
mock.return_value.__aenter__ = AsyncMock(return_value=mock.return_value)
mock.return_value.__aexit__ = AsyncMock(return_value=None)
return mock
@pytest.fixture
def mock_llm_response():
"""Factory fixture for mocking LLM responses."""
def _mock(content: str):
return AsyncMock(return_value=content)
return _mock
@pytest.fixture
def sample_evidence():
"""Sample Evidence objects for testing."""
from src.utils.models import Evidence, Citation
return [
Evidence(
content="Metformin shows promise in Alzheimer's...",
citation=Citation(
source="pubmed",
title="Metformin and Alzheimer's Disease",
url="https://pubmed.ncbi.nlm.nih.gov/12345678/",
date="2024-01-15"
),
relevance=0.85
)
]
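Note that sample_evidence imports Evidence and Citation from src/utils/models.py, which is not specified in this phase. A minimal sketch consistent with the fixture's fields (the exact field types and constraints here are assumptions, not the final models) could look like:
"""Minimal sketch of src/utils/models.py, inferred from the sample_evidence fixture."""
from pydantic import BaseModel, Field


class Citation(BaseModel):
    """Where a piece of evidence came from."""
    source: str  # e.g. "pubmed" or "web"
    title: str
    url: str
    date: str  # ISO date string, e.g. "2024-01-15"


class Evidence(BaseModel):
    """A snippet of evidence plus its citation and a relevance score."""
    content: str
    citation: Citation
    relevance: float = Field(default=0.0, ge=0.0, le=1.0)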
6. Core Utilities Implementation
src/utils/config.py
"""Application configuration using Pydantic Settings."""
import logging
from typing import Literal

import structlog
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    """Strongly-typed application settings."""

    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
        extra="ignore",
    )

    # LLM Configuration
    openai_api_key: str | None = Field(default=None, description="OpenAI API key")
    anthropic_api_key: str | None = Field(default=None, description="Anthropic API key")
    llm_provider: Literal["openai", "anthropic"] = Field(
        default="openai",
        description="Which LLM provider to use",
    )
    openai_model: str = Field(default="gpt-4o", description="OpenAI model name")
    anthropic_model: str = Field(
        default="claude-3-5-sonnet-20241022", description="Anthropic model"
    )

    # PubMed Configuration
    ncbi_api_key: str | None = Field(
        default=None, description="NCBI API key for higher rate limits"
    )

    # Agent Configuration
    max_iterations: int = Field(default=10, ge=1, le=50)
    search_timeout: int = Field(default=30, description="Seconds to wait for search")

    # Logging
    log_level: Literal["DEBUG", "INFO", "WARNING", "ERROR"] = "INFO"

    def get_api_key(self) -> str:
        """Get the API key for the configured provider."""
        if self.llm_provider == "openai":
            if not self.openai_api_key:
                raise ValueError("OPENAI_API_KEY not set")
            return self.openai_api_key
        else:
            if not self.anthropic_api_key:
                raise ValueError("ANTHROPIC_API_KEY not set")
            return self.anthropic_api_key


def get_settings() -> Settings:
    """Factory function to get settings (allows mocking in tests)."""
    return Settings()


def configure_logging(settings: Settings) -> None:
    """Configure structured logging at the level from settings."""
    # structlog delegates output to stdlib logging here, so set the level there
    # (otherwise settings.log_level would never take effect).
    logging.basicConfig(level=getattr(logging, settings.log_level), format="%(message)s")
    structlog.configure(
        processors=[
            structlog.stdlib.filter_by_level,
            structlog.stdlib.add_logger_name,
            structlog.stdlib.add_log_level,
            structlog.processors.TimeStamper(fmt="iso"),
            structlog.processors.JSONRenderer(),
        ],
        wrapper_class=structlog.stdlib.BoundLogger,
        context_class=dict,
        logger_factory=structlog.stdlib.LoggerFactory(),
    )


# Singleton for easy import
settings = get_settings()
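Once config.py exists, startup code (for example in src/app.py, once it is written) can wire settings and logging together. A minimal illustrative sketch that assumes only the module above:
"""Illustrative startup wiring; the event name and fields are examples only."""
import structlog

from src.utils.config import configure_logging, get_settings

settings = get_settings()
configure_logging(settings)
logger = structlog.get_logger(__name__)
logger.info(
    "app_startup",
    provider=settings.llm_provider,
    max_iterations=settings.max_iterations,
)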
src/utils/exceptions.py
"""Custom exceptions for DeepCritical."""
class DeepCriticalError(Exception):
"""Base exception for all DeepCritical errors."""
pass
class SearchError(DeepCriticalError):
"""Raised when a search operation fails."""
pass
class JudgeError(DeepCriticalError):
"""Raised when the judge fails to assess evidence."""
pass
class ConfigurationError(DeepCriticalError):
"""Raised when configuration is invalid."""
pass
class RateLimitError(SearchError):
"""Raised when we hit API rate limits."""
pass
7. TDD Workflow: First Test
tests/unit/utils/test_config.py
"""Unit tests for configuration loading."""
import pytest
from unittest.mock import patch
import os
class TestSettings:
"""Tests for Settings class."""
def test_default_max_iterations(self):
"""Settings should have default max_iterations of 10."""
from src.utils.config import Settings
# Clear any env vars
with patch.dict(os.environ, {}, clear=True):
settings = Settings()
assert settings.max_iterations == 10
def test_max_iterations_from_env(self):
"""Settings should read MAX_ITERATIONS from env."""
from src.utils.config import Settings
with patch.dict(os.environ, {"MAX_ITERATIONS": "25"}):
settings = Settings()
assert settings.max_iterations == 25
def test_invalid_max_iterations_raises(self):
"""Settings should reject invalid max_iterations."""
from src.utils.config import Settings
from pydantic import ValidationError
with patch.dict(os.environ, {"MAX_ITERATIONS": "100"}):
with pytest.raises(ValidationError):
Settings() # 100 > 50 (max)
def test_get_api_key_openai(self):
"""get_api_key should return OpenAI key when provider is openai."""
from src.utils.config import Settings
with patch.dict(os.environ, {
"LLM_PROVIDER": "openai",
"OPENAI_API_KEY": "sk-test-key"
}):
settings = Settings()
assert settings.get_api_key() == "sk-test-key"
def test_get_api_key_missing_raises(self):
"""get_api_key should raise when key is not set."""
from src.utils.config import Settings
with patch.dict(os.environ, {"LLM_PROVIDER": "openai"}, clear=True):
settings = Settings()
with pytest.raises(ValueError, match="OPENAI_API_KEY not set"):
settings.get_api_key()
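Later phases will add tests for the HTTP tools; respx (in the dev dependencies) can stub httpx without touching the network. A minimal illustrative sketch, with the NCBI esearch URL and the canned response chosen purely for demonstration:
"""Sketch of stubbing httpx with respx in a unit test (illustrative only)."""
import httpx
import pytest
import respx

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


@pytest.mark.unit
@respx.mock
async def test_respx_stubs_httpx_requests():
    # Register a fake route; any matching GET receives this canned JSON response.
    route = respx.get(ESEARCH_URL).mock(
        return_value=httpx.Response(200, json={"esearchresult": {"idlist": ["12345678"]}})
    )
    async with httpx.AsyncClient() as client:
        resp = await client.get(ESEARCH_URL)
    assert route.called
    assert resp.json()["esearchresult"]["idlist"] == ["12345678"]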
8. Makefile (Developer Experience)
Create a Makefile for the standard developer-experience commands (recipe lines must be indented with tabs, not spaces):
.PHONY: install test test-cov lint format typecheck check clean

install:
	uv sync --all-extras
	uv run pre-commit install

test:
	uv run pytest tests/unit/ -v

test-cov:
	uv run pytest --cov=src --cov-report=term-missing

lint:
	uv run ruff check src tests

format:
	uv run ruff format src tests

typecheck:
	uv run mypy src

check: lint typecheck test
	@echo "All checks passed!"

clean:
	rm -rf .pytest_cache .mypy_cache .ruff_cache __pycache__ .coverage
	find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null || true
9. Execution Commands
# Install all dependencies
uv sync --all-extras
# Run tests (should pass after implementing config.py)
uv run pytest tests/unit/utils/test_config.py -v
# Run full test suite with coverage
uv run pytest --cov=src --cov-report=term-missing
# Run linting
uv run ruff check src tests
uv run ruff format src tests
# Run type checking
uv run mypy src
# Set up pre-commit hooks
uv run pre-commit install
10. Implementation Checklist
- [ ] Install uv and verify the version
- [ ] Run uv init --name deepcritical
- [ ] Create pyproject.toml (copy from above)
- [ ] Create the directory structure (run the mkdir commands)
- [ ] Create .env.example and .env
- [ ] Create .pre-commit-config.yaml
- [ ] Create Makefile (copy from above)
- [ ] Create tests/conftest.py
- [ ] Implement src/utils/config.py
- [ ] Implement src/utils/exceptions.py
- [ ] Write tests in tests/unit/utils/test_config.py
- [ ] Run make install
- [ ] Run make check (ALL CHECKS MUST PASS)
- [ ] Commit: git commit -m "feat: phase 1 foundation complete"
11. Definition of Done
Phase 1 is COMPLETE when:
- [ ] uv run pytest passes with 100% of tests green
- [ ] uv run ruff check src tests has 0 errors
- [ ] uv run mypy src has 0 errors
- [ ] Pre-commit hooks are installed and working
- [ ] from src.utils.config import settings works in a Python REPL
Proceed to Phase 2 ONLY after all checkboxes are complete.