ReplicaLab Project Map
Living reference of every module, class, function, and relationship. Updated after each implementation session.
Last updated: 2026-03-07 (JDG 01-03 scoring implemented)
Module Index
| File | What it covers |
|---|---|
| models.md | Data contracts: actions, observations, protocol, reward, episode state |
| scenarios.md | Scenario generation: templates, constraints, resources, hidden specs |
| agents.md | Agent policies: scientist prompt/parse/retry, lab manager feasibility/suggest/compose |
| validation.md | Protocol validation: deterministic checks against scenario constraints |
| scoring.md | Judge scoring: rigor, feasibility, fidelity |
| server.md | FastAPI server: REST + WebSocket endpoints, stub environment |
| frontend.md | React UI: dashboard, episode viewer, components |
| config.md | Shared constants: rounds, budget, timeouts |
| tests.md | Test coverage: 93 tests across 8 files |
Dependency Graph
server/app.py
├── replicalab.config
├── replicalab.models
├── replicalab.scenarios (generate_scenario, available_scenario_families)
└── replicalab.agents (check_feasibility, suggest_alternative, compose_lab_manager_response)
replicalab/agents/scientist_policy.py
├── replicalab.models (ScientistAction, ScientistObservation, Protocol, ConversationEntry)
└── replicalab.scenarios (NormalizedScenarioPack)
replicalab/agents/lab_manager_policy.py
├── replicalab.models (LabManagerAction, LabManagerActionType, Protocol)
├── replicalab.scenarios (NormalizedScenarioPack)
└── replicalab.utils.validation (ValidationResult, validate_protocol)
replicalab/scenarios/templates.py
├── replicalab.config (MAX_BUDGET, MAX_ROUNDS)
├── replicalab.models (ScientistObservation, LabManagerObservation)
├── replicalab.scenarios.{math_reasoning, ml_benchmark, finance_trading}
└── replicalab.utils.seed (seed_rng)
replicalab/utils/validation.py
├── replicalab.models (Protocol)
└── replicalab.scenarios.templates (NormalizedScenarioPack)
replicalab/scoring/
├── replicalab.models (Protocol, RewardBreakdown)
├── replicalab.scenarios (NormalizedScenarioPack, HiddenReferenceSpec)
├── replicalab.agents.lab_manager_policy (check_feasibility, FeasibilityCheckResult)
└── replicalab.utils.text (element_tokens, normalize_label)
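The graph above can be checked mechanically for import cycles. The following is a minimal sketch, not part of the project: the `DEPENDENCIES` mapping and the `import_order` helper are hypothetical names, the edges are transcribed by hand from the listing above, and the two package-level entries (`replicalab.scenarios`, `replicalab.agents`) are assumed to depend on their submodules because the file tree describes their `__init__.py` files as re-exports.

```python
from graphlib import CycleError, TopologicalSorter

# Hand-transcribed edges: module -> set of modules it imports.
# Package-level entries are an assumption based on the "re-exports" notes.
DEPENDENCIES = {
    "server.app": {
        "replicalab.config", "replicalab.models",
        "replicalab.scenarios", "replicalab.agents",
    },
    "replicalab.agents": {
        "replicalab.agents.scientist_policy",
        "replicalab.agents.lab_manager_policy",
    },
    "replicalab.agents.scientist_policy": {
        "replicalab.models", "replicalab.scenarios",
    },
    "replicalab.agents.lab_manager_policy": {
        "replicalab.models", "replicalab.scenarios",
        "replicalab.utils.validation",
    },
    "replicalab.scenarios": {"replicalab.scenarios.templates"},
    "replicalab.scenarios.templates": {
        "replicalab.config", "replicalab.models",
        "replicalab.scenarios.math_reasoning",
        "replicalab.scenarios.ml_benchmark",
        "replicalab.scenarios.finance_trading",
        "replicalab.utils.seed",
    },
    "replicalab.utils.validation": {
        "replicalab.models", "replicalab.scenarios.templates",
    },
    "replicalab.scoring": {
        "replicalab.models", "replicalab.scenarios",
        "replicalab.agents.lab_manager_policy", "replicalab.utils.text",
    },
}


def import_order(deps):
    """Return modules ordered so each one follows everything it imports.

    Raises graphlib.CycleError if the mapping contains a circular import.
    """
    return list(TopologicalSorter(deps).static_order())


try:
    order = import_order(DEPENDENCIES)
    print("no cycles; one valid import order:")
    for name in order:
        print("  " + name)
except CycleError as exc:
    print("circular dependency:", exc.args[1])
```

Re-running a check like this after each implementation session is one way to notice when a new module introduces a circular import before it fails at runtime.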
File Tree (implemented only)
replicalab/
├── __init__.py (empty)
├── config.py (shared constants)
├── models.py (25 classes: all data contracts)
├── agents/
│   ├── __init__.py (re-exports from submodules)
│   ├── scientist_policy.py (AGT 01-04: prompt, formatter, parser, retry, baseline)
│   └── lab_manager_policy.py (AGT 05-07: feasibility, suggest, compose)
├── scenarios/
│   ├── __init__.py (re-exports from templates)
│   ├── templates.py (NormalizedScenarioPack, generate_scenario, apply_difficulty)
│   ├── math_reasoning.py (2 cases: Cauchy-Schwarz, Jensen's inequality)
│   ├── ml_benchmark.py (2 cases: AG News TinyBERT, CIFAR-10 ResNet-18)
│   └── finance_trading.py (2 cases: SPY/QQQ mean-reversion, momentum futures)
├── scoring/
│   ├── __init__.py (exports score_rigor, score_feasibility, score_fidelity)
│   ├── rigor.py (JDG 01: structural quality + criteria coverage)
│   ├── feasibility.py (JDG 02: wraps FeasibilityCheckResult with partial credit)
│   └── fidelity.py (JDG 03: substitution-aware hidden spec alignment)
└── utils/
    ├── seed.py (deterministic RNG from SHA256)
    ├── text.py (shared token matching: normalize_label, element_tokens)
    └── validation.py (MOD 05: protocol validation, 5 checks)
server/
└── app.py (FastAPI + WebSocket + _StubEnv)
frontend/
├── package.json (React 19, Three.js, Framer Motion, Recharts, Tailwind)
├── src/
│   ├── App.tsx (router: /, /episode, /episode/:id)
│   ├── types/index.ts (TypeScript interfaces mirroring Python models)
│   ├── lib/
│   │   ├── api.ts (REST + WebSocket client + mock data generators)
│   │   ├── audio.ts (audio utilities)
│   │   └── utils.ts (shared helpers)
│   ├── components/ (15 React components)
│   └── pages/ (DashboardPage, EpisodePage)
└── vite.config.ts
tests/
├── test_config.py (3 tests)
├── test_models.py (15 tests)
├── test_scenarios.py (8 tests)
├── test_validation.py (13 tests)
├── test_scientist_policy.py (18 tests)
├── test_lab_manager_policy.py (13 tests)
├── test_reward.py (18 tests: JDG 01-03 scoring)
└── test_server.py (5 tests: API endpoints)
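Because the per-file test counts are maintained by hand, a small tally helps keep the totals quoted elsewhere in this map consistent with the listing. This is a sketch under stated assumptions: `TEST_LISTING` is a copy of the `tests/` entries above, and `tally` is a hypothetical helper, not something the repository provides.

```python
import re

# Copied from the tests/ listing above; update alongside the file tree.
TEST_LISTING = """\
test_config.py (3 tests)
test_models.py (15 tests)
test_scenarios.py (8 tests)
test_validation.py (13 tests)
test_scientist_policy.py (18 tests)
test_lab_manager_policy.py (13 tests)
test_reward.py (18 tests: JDG 01-03 scoring)
test_server.py (5 tests: API endpoints)
"""


def tally(listing):
    """Map each test file name to the test count given in parentheses."""
    counts = {}
    for line in listing.splitlines():
        match = re.search(r"(\S+\.py)\s*\((\d+) tests", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


counts = tally(TEST_LISTING)
total = sum(counts.values())
print(f"{total} tests across {len(counts)} files")
```

The counts here reflect only what this map records. Running the test suite itself remains the authoritative source for the actual number of collected tests.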
Task Completion Status
| Area | Done | Remaining | Key gaps |
|---|---|---|---|
| Models (MOD) | MOD 01-05, 09, 11-12 | MOD 06 | Semantic validators for impossible plans |
| Scenarios (SCN) | SCN 01-12 | SCN 13 | Booking/scheduling data model |
| Agents (AGT) | AGT 01-07, 11 | AGT 08-10 | LLM-backed scientist, model selection |
| Judge (JDG) | JDG 01-03 | JDG 04-08 | Reward composition, bonuses, penalties |
| Environment (ENV) | None | ENV 01-11 | Entire real environment |
| Server (API) | API 01-04, 06 (partial) | API 05, 07-10 | Replay, auth, rate limiting |
| Frontend (FND) | FND 01-10 | None | Complete |
| Training (TRN) | None | TRN 01-18 | Entire RL pipeline |