
# SENTINEL Visual System

This file is the diagram source of truth. Every diagram used in README, UI, blog, or slides should be derived from here.

## Diagram Inventory

| Diagram | Purpose | Status |
| --- | --- | --- |
| System stack | show the code architecture | ready |
| Episode lifecycle | explain `reset()` to terminal reward | ready |
| Trust and reward flow | show how state turns into learning signal | ready |
| Reward engine v2 | show process-aware reward components | ready |
| Before / after | show why SENTINEL matters | ready |
| Theme fit | map the project to the hackathon | ready |
| Training loop | show OpenEnv -> TRL / Unsloth pipeline | ready |

## 1. System Stack

```mermaid
flowchart TD
  A["HTTP client / UI / inference.py"] --> B["app.py<br/>FastAPI on port 7860"]
  B --> C["SentinelEnv<br/>environment.py"]
  B --> D["_sessions<br/>session_id -> SentinelEnv"]
  C --> E["TaskGraph<br/>task_graph.py"]
  C --> F["TrustLedger<br/>trust_ledger.py"]
  C --> G["SpecialistPool<br/>specialists.py"]
  C --> H["RewardEngine<br/>graders.py"]
  C --> I["Scenario dataset<br/>scenarios.py"]
  C --> J["Typed models<br/>models.py"]
  B --> K["openenv.yaml"]
  B --> L["static/index.html"]
```

## 2. Episode Lifecycle

```mermaid
flowchart TD
  A["reset(task_type, seed)"] --> B["sample scenario"]
  B --> C["reshuffle hidden specialist profiles"]
  C --> D["set trust priors to 0.50"]
  D --> E["build task graph"]
  E --> F["return first observation"]

  F --> G["orchestrator chooses action"]
  G --> H["delegate / verify / self solve / skip"]
  H --> I["specialist or self execution"]
  I --> J["record outcome in TaskGraph"]
  J --> K["update TrustLedger"]
  K --> L["compute step reward"]
  L --> M{"done?"}
  M -- "no" --> N["return next observation"]
  N --> G
  M -- "yes" --> O["compute terminal reward"]
  O --> P["return done=True with final info"]
```
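The lifecycle above can be sketched as a tiny `reset()` / `step()` loop. Everything below is illustrative (class name, placeholder rewards, two made-up specialist names); it is not the real `SentinelEnv` API, just the shape of an episode:

```python
import random

# Minimal sketch of the reset()/step() lifecycle from the diagram.
class EpisodeSketch:
    def __init__(self, max_steps=5, seed=None):
        self.rng = random.Random(seed)   # seed -> reproducible episodes
        self.max_steps = max_steps

    def reset(self):
        # sample scenario, reshuffle hidden profiles, set trust priors to 0.50
        self.step_count = 0
        self.trust = {name: 0.50 for name in ("coder", "researcher")}
        return {"subtask": 0, "trust": dict(self.trust)}

    def step(self, action):
        self.step_count += 1
        step_reward = self.rng.uniform(-0.1, 0.1)  # placeholder step signal
        done = self.step_count >= self.max_steps
        if done:
            step_reward += 1.0  # terminal reward lands only at episode end
        obs = {"subtask": self.step_count, "trust": dict(self.trust)}
        return obs, step_reward, done
```

The key property the diagram encodes, and the sketch preserves, is that the terminal reward arrives only on the `done=True` transition.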

## 3. Trust and Reward Flow

```mermaid
flowchart LR
  A["Observation<br/>subtask, stakes, trust snapshot"] --> B["Action choice"]
  B --> C["Specialist result<br/>outcome, confidence, adversarial flag, step_cost"]
  C --> D["TaskGraph update"]
  C --> E["TrustLedger Bayesian update"]
  D --> F["completion, detections, poisonings"]
  E --> G["calibration state"]
  F --> H["RewardEngine"]
  G --> H
  H --> I["step reward"]
  H --> J["terminal reward"]
```
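One common way to realize a Bayesian trust update like the TrustLedger step above is a Beta posterior per specialist. This is a hypothetical sketch, not the ledger's actual math; the Beta(1, 1) prior is chosen so its mean matches the 0.50 trust prior set at `reset()`:

```python
# Hedged sketch of a per-specialist Bayesian trust update.
class TrustSketch:
    def __init__(self):
        self.alpha = 1.0  # pseudo-count of observed successes
        self.beta = 1.0   # pseudo-count of observed failures

    def update(self, success: bool) -> float:
        """Fold one delegation outcome into the posterior; return new mean."""
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0
        return self.trust()

    def trust(self) -> float:
        # Posterior mean of Beta(alpha, beta).
        return self.alpha / (self.alpha + self.beta)
```

A conjugate update like this is cheap, order-independent, and naturally becomes harder to move as evidence accumulates, which is the behavior the calibration state feeds into the RewardEngine.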

## 4. Reward Engine v2

```mermaid
flowchart LR
  A["Specialist result<br/>outcome, confidence, metadata"] --> B["Step reward"]
  C["TaskGraph<br/>completion, detections, poisonings"] --> D["Terminal reward"]
  E["TrustLedger<br/>calibration, fingerprints"] --> D

  B --> B1["task accuracy"]
  B --> B2["stakes awareness"]
  B --> B3["efficiency"]
  B --> B4["confidence alignment"]
  B --> B5["verification quality"]
  B --> B6["domain routing"]

  D --> D1["completion rate"]
  D --> D2["detection rate"]
  D --> D3["trust calibration"]
  D --> D4["episode efficiency"]

  B --> R["reward-report endpoint"]
  D --> R
  R --> T["component trace for judges"]
```
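A sketch of how the six step components above could combine into a scalar while keeping a per-component trace for the reward-report endpoint. The weights here are invented placeholders, not the engine's real values:

```python
# Illustrative composition of the v2 step-reward components named in
# the diagram. Weights are assumptions for the sketch only.
STEP_WEIGHTS = {
    "task_accuracy": 0.4,
    "stakes_awareness": 0.15,
    "efficiency": 0.1,
    "confidence_alignment": 0.15,
    "verification_quality": 0.1,
    "domain_routing": 0.1,
}

def step_reward(components: dict[str, float]) -> tuple[float, dict]:
    """Weighted sum of per-step components, plus a per-component trace
    of the kind a reward-report endpoint could expose to judges."""
    trace = {
        name: STEP_WEIGHTS[name] * components.get(name, 0.0)
        for name in STEP_WEIGHTS
    }
    return sum(trace.values()), trace
```

Returning the trace alongside the scalar is what makes the reward "process-aware" in an auditable sense: every step can show which component earned (or lost) the points.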

## 5. Before / After

```mermaid
flowchart LR
  subgraph BEFORE["Before SENTINEL"]
    A1["Uniform trust"] --> A2["Blind delegation"]
    A2 --> A3["Poison accepted at high stakes"]
    A3 --> A4["Downstream subtasks inherit bad state"]
    A4 --> A5["Mission drifts or fails"]
  end

  subgraph AFTER["After SENTINEL"]
    B1["Behavior updates trust"] --> B2["Low-trust high-stakes node detected"]
    B2 --> B3["Verify instead of delegate"]
    B3 --> B4["Poison blocked before cascade"]
    B4 --> B5["Mission completes cleanly"]
  end
```

## 6. Theme Fit

```mermaid
flowchart TD
  S["SENTINEL"] --> T1["Theme 1<br/>multi-agent interaction"]
  S --> T2["Theme 2<br/>long-horizon planning"]
  S --> T4["Theme 4<br/>self-improvement"]
  S --> T5["Theme 5<br/>wild card"]

  T1 --> B1["orchestrator + five specialists<br/>partial observability<br/>adversarial dynamics"]
  T2 --> B2["task graph<br/>step budget pressure<br/>delayed terminal reward"]
  T4 --> B3["profile reshuffle<br/>auto-curriculum<br/>no memorization"]
  T5 --> B4["real production weakness<br/>blind trust in agent pipelines"]
```

## 7. Training Loop

```mermaid
flowchart LR
  A["Prompt / observation"] --> B["Model rollout"]
  B --> C["Action text or structured action"]
  C --> D["SENTINEL environment"]
  D --> E["Reward + next observation"]
  E --> F["TRL / GRPO trainer"]
  F --> G["updated policy"]
  G --> B

  H["training/evaluate.py"] --> I["random / heuristic / oracle-lite"]
  I --> J["evaluation_results.json"]
  I --> K["baseline_comparison.png"]
```
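The rollout half of this loop can be sketched with stubs standing in for the policy, the SENTINEL environment, and the TRL / GRPO trainer; all names below are illustrative, not the real training code:

```python
# Schematic rollout collection for the loop in the diagram: the policy
# maps observations to actions, the environment returns reward + next
# observation, and the resulting trajectory is what a GRPO-style
# trainer would consume to update the policy.
def rollout(policy, env, max_steps=8):
    """Collect one episode of (observation, action, reward) triples."""
    trajectory = []
    obs = env.reset()
    for _ in range(max_steps):
        action = policy(obs)                  # model rollout -> action
        obs, reward, done = env.step(action)  # env: reward + next obs
        trajectory.append((obs, action, reward))
        if done:
            break
    return trajectory
```

Because the reward arrives per step but the terminal reward is delayed, group-relative methods like GRPO compare whole trajectories rather than single transitions; this sketch deliberately returns the full trajectory for that reason.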

## Use Rules

  1. In slide decks, do not invent component names that do not exist in the code.
  2. Use SentinelEnv, TrustLedger, SpecialistPool, TaskGraph, RewardEngine consistently.
  3. Use real baseline numbers in public before/after materials.
  4. Export polished PNG versions from these mermaid sources later, but keep this file as the editable truth.