arkai2025 committed on
Commit e2fdf09 · 1 Parent(s): 1b9fea1

feat(execute): allow up to 15 recommendations

Lift stage 3 cap via MAX_RECOMMENDATIONS so UI can render more actions.

Add prompt placeholder replacement and balance deploy vs move logic.
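
The placeholder replacement mentioned above can be sketched as follows. This is a minimal illustration, assuming prompt templates carry a `{{MAX_RECOMMENDATIONS}}` token and that the loaded config is a nested structure of dicts, lists, and strings. The helper name `apply_placeholders` and the sample `config` are hypothetical and not taken from the repository.

```python
MAX_RECOMMENDATIONS = 15
PLACEHOLDERS = {"{{MAX_RECOMMENDATIONS}}": str(MAX_RECOMMENDATIONS)}


def apply_placeholders(value):
    """Recursively replace placeholder tokens in strings nested in dicts/lists."""
    if isinstance(value, str):
        for token, replacement in PLACEHOLDERS.items():
            value = value.replace(token, replacement)
        return value
    if isinstance(value, dict):
        return {key: apply_placeholders(item) for key, item in value.items()}
    if isinstance(value, list):
        return [apply_placeholders(item) for item in value]
    return value  # numbers, booleans, and None pass through unchanged


# Hypothetical config, shaped like what yaml.safe_load might return.
config = {
    "execute": {
        "system": "Return at most {{MAX_RECOMMENDATIONS}} recommendations.",
        "rules": ["Cap at {{MAX_RECOMMENDATIONS}}", "Deploy adjacent to fire"],
        "temperature": 0.2,
    }
}
resolved = apply_placeholders(config)
print(resolved["execute"]["system"])
```

Because the function builds new dicts and lists instead of mutating its input, the raw templates stay available if the cap ever needs to be re-applied with a different value.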

Refresh HUD copy and prompt docs with new tooling and tick pacing.
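
The deploy-versus-move balancing named in the commit message is, in essence, a round-robin pick across action types until the cap is reached. The sketch below is a simplified illustration: `Rec` is a hypothetical stand-in for the project's `Recommendation` type, and the pools are ordered purely by size, whereas the commit ranks deploys by the planned deploy targets.

```python
from dataclasses import dataclass


@dataclass
class Rec:
    """Hypothetical stand-in for the project's Recommendation type."""
    action: str
    label: str


def balance(recs, max_actions):
    """Interleave action types round-robin until max_actions are selected."""
    if len(recs) <= max_actions:
        return list(recs)

    # Group recommendations by action type; anything unknown goes to "other".
    pools = {"deploy": [], "move": [], "other": []}
    for rec in recs:
        key = rec.action if rec.action in ("deploy", "move") else "other"
        pools[key].append(rec)

    # Assumption: larger pools are visited first within each round.
    order = sorted(pools, key=lambda k: len(pools[k]), reverse=True)
    cursor = {key: 0 for key in pools}
    selected = []
    while len(selected) < max_actions:
        progressed = False
        for key in order:
            if cursor[key] >= len(pools[key]):
                continue  # this pool is exhausted
            selected.append(pools[key][cursor[key]])
            cursor[key] += 1
            progressed = True
            if len(selected) >= max_actions:
                break
        if not progressed:
            break  # every pool is exhausted
    return selected


recs = (
    [Rec("deploy", f"d{i}") for i in range(5)]
    + [Rec("move", f"m{i}") for i in range(2)]
    + [Rec("remove", "r0")]
)
picked = balance(recs, max_actions=4)
print([r.label for r in picked])  # ['d0', 'm0', 'r0', 'd1']
```

Compared with a plain slice such as `recs[:4]`, the interleaving keeps a long run of one action type from crowding out the others when the list is truncated.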

Files changed (4)
  1. README.md +2 -2
  2. agent.py +88 -5
  3. app.py +3 -9
  4. prompts.yaml +2 -1
README.md CHANGED
@@ -70,7 +70,7 @@ tags:
 
 1. **Assess (Stage 1)** – Calls the analytic MCP tools to classify intensity, coverage, building threats, idle units, and threat level.
 2. **Plan (Stage 2)** – Chooses a strategy (`deploy_new`, `optimize_existing`, `balanced`, etc.) and determines how many actions are justified.
-3. **Execute (Stage 3)** – Emits up to four JSON recommendations (deploy/move/remove) that the service can auto-execute or queue for the player.
+3. **Execute (Stage 3)** – Emits JSON recommendations (deploy/move/remove) that the service can auto-execute or queue for the player.
 4. **Cycle Summary (Stage 4)** – Condenses every loop into a headline, highlights, risks, and next-focus bullets for the Gradio timeline.
 5. **After-Action Report** – Once the scenario ends, the agent merges all summaries + metrics into a “battle report” overlay with charts and actionable follow-ups.
 6. **Human-in-the-loop** – Players can pause, inspect reasoning, toggle auto-execute, or manually override/augment deployments at any point.
@@ -82,7 +82,7 @@ Prompts for every stage live in `prompts.yaml`, making it easy to retune instruc
 ## 🎮 Gameplay Loop & UI Experience
 
 - Start/reset from the control bar or open the accordion to tweak **fire count**, **intensity**, **building cluster size**, **max unit slots**, and **seed**.
-- The advisor refreshes roughly every 10 ticks (≈10 seconds). When it is “thinking,” the HUD animates, and the chat timeline streams stage-by-stage logs.
+- The advisor refreshes roughly every 10 ticks. When it is “thinking,” the HUD animates, and the chat timeline streams stage-by-stage logs.
 - **Auto-Execute** default = ON. Turn it off to require manual approvals (or to stress-test AI reasoning while you handle deployments yourself).
 - Click any grid cell to deploy trucks/helis, remove an existing unit, or even **ignite a new fire** (`🔥 Fire` option) in sandbox mode.
 - Event log + player action chips record everything the human does (deploy, remove, ignite) for inclusion inside the after-action report.
agent.py CHANGED
@@ -48,9 +48,13 @@ def get_hf_token() -> str | None:
 
 
 # =============================================================================
-# Load Prompts from Configuration
+# Stage 3 action cap + Prompt loading
 # =============================================================================
 
+# Stage 3 UI uses MAX_RECOMMENDATIONS to control how many actions can be rendered
+MAX_RECOMMENDATIONS = 15
+PROMPT_PLACEHOLDERS = {"{{MAX_RECOMMENDATIONS}}": str(MAX_RECOMMENDATIONS)}
+
 def load_prompts() -> dict:
     """Load prompts from prompts.yaml configuration file."""
     prompts_path = Path(__file__).parent / "prompts.yaml"
@@ -59,7 +63,19 @@ def load_prompts() -> dict:
             return yaml.safe_load(f)
     return {}
 
-PROMPTS_CONFIG = load_prompts()
+def _apply_prompt_placeholders(value):
+    """Recursively replace placeholder tokens inside prompt config."""
+    if isinstance(value, str):
+        for token, replacement in PROMPT_PLACEHOLDERS.items():
+            value = value.replace(token, replacement)
+        return value
+    if isinstance(value, dict):
+        return {k: _apply_prompt_placeholders(v) for k, v in value.items()}
+    if isinstance(value, list):
+        return [_apply_prompt_placeholders(item) for item in value]
+    return value
+
+PROMPTS_CONFIG = _apply_prompt_placeholders(load_prompts())
 
 
 # =============================================================================
@@ -757,7 +773,7 @@ Building positions: {json.dumps([(b["x"], b["y"]) for b in buildings[:15]])}
 INSTRUCTIONS:
 1. FIRST generate MOVE actions for ineffective units → move them to uncovered fires
 2. THEN generate DEPLOY actions if more units needed
-3. Max 4 recommendations total
+3. Max {MAX_RECOMMENDATIONS} recommendations total
 4. Remember: deploy ADJACENT to fire (1-2 cells away), not ON the fire
 
 Output format:
@@ -779,7 +795,7 @@ Output format:
         building_positions = set((b["x"], b["y"]) for b in buildings)
         used_positions = set()
 
-        for rec in raw_recs[:4]:  # Limit to 4
+        for rec in raw_recs[:MAX_RECOMMENDATIONS]:  # Limit to UI capacity
             action = rec.get("action", "deploy")
             unit_type = rec.get("unit_type", "fire_truck")
             target = rec.get("target", {})
@@ -1032,7 +1048,74 @@ OUTPUT FORMAT:
                     action="deploy"
                 ))
 
-        return recommendations[:4]  # Cap at 4 for UI display
+        return self._prioritize_recommendations(
+            recommendations,
+            plan.deploy_count,
+            smart_deploy_count,
+        )
+
+    def _prioritize_recommendations(
+        self,
+        recommendations: list[Recommendation],
+        plan_deploy_target: int,
+        smart_deploy_target: int,
+        max_actions: int = MAX_RECOMMENDATIONS,
+    ) -> list[Recommendation]:
+        """
+        Ensure we return a balanced mix of move/deploy actions without exceeding UI limits.
+        """
+        if len(recommendations) <= max_actions:
+            return recommendations
+
+        deploy_recs = [rec for rec in recommendations if rec.action == "deploy"]
+        move_recs = [rec for rec in recommendations if rec.action == "move"]
+        other_recs = [rec for rec in recommendations if rec.action not in ("deploy", "move")]
+
+        deploy_priority = 0
+        if deploy_recs:
+            deploy_priority = max(plan_deploy_target, smart_deploy_target, 1)
+        move_priority = len(move_recs)
+        other_priority = len(other_recs)
+
+        priority_pairs = []
+        if deploy_priority:
+            priority_pairs.append(("deploy", deploy_priority))
+        if move_priority:
+            priority_pairs.append(("move", move_priority))
+        if other_priority:
+            priority_pairs.append(("other", other_priority))
+
+        if not priority_pairs:
+            return recommendations[:max_actions]
+
+        priority_pairs.sort(key=lambda item: item[1], reverse=True)
+        ordered_types = [ptype for ptype, _ in priority_pairs]
+
+        # Ensure every action type gets a chance once primary priorities are exhausted
+        for action_type in ("deploy", "move", "other"):
+            if action_type not in ordered_types:
+                ordered_types.append(action_type)
+
+        pools = {"deploy": deploy_recs, "move": move_recs, "other": other_recs}
+        indices = {key: 0 for key in pools}
+        selected: list[Recommendation] = []
+
+        while len(selected) < max_actions:
+            added = False
+            for action_type in ordered_types:
+                pool = pools[action_type]
+                idx = indices[action_type]
+                if idx >= len(pool):
+                    continue
+                selected.append(pool[idx])
+                indices[action_type] += 1
+                added = True
+                if len(selected) >= max_actions:
+                    break
+            if not added:
+                break
+
+        return selected
 
     # =========================================================================
     # After-Action Report
app.py CHANGED
@@ -1632,7 +1632,7 @@ def create_app() -> gr.Blocks:
         - 🚁 **Helicopter:** Wide coverage (25%), covers 2 tiles outward from its center — best for large-area control
         - **Settings & Controls:** Use the panel below to quickly tune scenario difficulty (fires, buildings, units, randomness) before sending the team in
 
-        **🏆 Win:** Extinguish all fires | **💀 Lose:** Building ≤ 50%
+        **🏆 Win:** Extinguish all fires | **💀 Lose:** Building ≤ 50% or Tick ≥ 200 (time out)
         """, elem_classes=["how-to-play"])
 
         # Collapsible Controls Section
@@ -1640,7 +1640,7 @@ def create_app() -> gr.Blocks:
             with gr.Row():
                 with gr.Column(scale=1):
                     fire_count = gr.Slider(
-                        minimum=1, maximum=30, value=15, step=1,
+                        minimum=1, maximum=40, value=20, step=1,
                         label="🔥 Initial Fire Count",
                         info="Number of fire starting points (1-25)"
                     )
@@ -1688,14 +1688,8 @@ def create_app() -> gr.Blocks:
             with gr.Column(scale=2, min_width=300):
                 service = get_service()
                 advisor_interval_ticks = getattr(service, "advisor_interval", 10)
-                tick_interval_seconds = getattr(service, "tick_interval", 1.0)
-                advisor_interval_seconds = advisor_interval_ticks * tick_interval_seconds
-                if isinstance(advisor_interval_seconds, float) and advisor_interval_seconds.is_integer():
-                    advisor_interval_display = int(advisor_interval_seconds)
-                else:
-                    advisor_interval_display = round(advisor_interval_seconds, 1)
                 gr.Markdown(
-                    f"## 🤖 AI Tactical Advisor · (refreshes ~every {advisor_interval_display} seconds)"
+                    f"## 🤖 AI Tactical Advisor · (refreshes every {advisor_interval_ticks} ticks)"
                 )
                 auto_execute_toggle = gr.Checkbox(
                     label="🎮 Auto-Execute",
prompts.yaml CHANGED
@@ -106,6 +106,8 @@ plan:
     - `find_uncovered_fires()` → Fires needing coverage
     - `find_idle_units()` → Units to reposition first
     - `analyze_coverage()` → Full tactical view
+    - `remove_unit(x, y)` → Free useless/blocked units so slots can be redeployed
+    - `deploy_unit(unit_type, x, y)` → Insert new trucks/helis exactly where the plan decides
 
     STRATEGIC PRIORITIES (STRICT ORDER):
 
@@ -264,7 +266,6 @@ execute:
     RECOMMENDATION COUNT LOGIC:
     - recommendations = reposition_count + deploy_count
     - If uncovered building threats exist: add extra for EACH uncovered threat
-    - Cap at 4 for UI display
 
     CRITICAL RULES:
     1. Deploy 1-2 cells ADJACENT to fire (not ON the fire)