vaibhav12332112312 committed
Commit 97ee7e7 · 1 Parent(s): e2c547b
README.md CHANGED
@@ -93,6 +93,33 @@ Tiered from [Buffer 2.1M study](https://buffer.com/resources/how-often-to-post-o
 | `monthly_strategic` | Medium | + tag discovery/exploitation + energy + consistency |
 | `monthly_competitive` | Hard | + growth vs competitors + differentiation + content diversity |
 
+## Regulator/Judge Mode (per-day audit)
+
+Every day the env emits a deterministic, explainable `JudgeReport` on the observation:
+
+```python
+JudgeReport(
+    policy_compliance=1.00,    # 1.0 - sum(weighted_violations); see _compute_judge_report
+    sustainability_risk=0.10,  # 0.4*(1-energy_min) + 0.3*sleep_debt + 0.3*low_energy_ratio
+    strategic_quality=0.96,    # 0.4*engagement_per_post + 0.3*intent_diversity + 0.3*format_diversity
+    explanation="compliance=1.00 risk=0.10 strategy=0.96 | no policy violations",
+    violations=[],             # human-readable rule breaks (Buffer 2.1M, Van Dongen, Cen 2024)
+)
+```
+
+Auditable rules (all sourced): >5 posts/day → fatigue cliff (Buffer 2.1M); >7 posts/week → weekly cap; ≥4 collabs/month → diminishing returns (Cen 2024); >22h awake → sleep debt (Van Dongen 2003).
+
+## Headline metrics (final-step audit)
+
+The final observation carries `HeadlineMetrics` with the three numbers judges remember:
+
+| Metric | What it measures | Source of truth |
+|---|---|---|
+| `vs_baseline_pct` | (agent_score − heuristic_baseline) / heuristic_baseline | Empirical baseline loaded from `plots/training_summary.json["smart_heuristic"]` (0.43 / 0.77 / 0.81) |
+| `score_per_tool_call` | grader_score / total_tool_calls | Efficiency: did the agent learn to call tools sparingly? |
+| `score_per_1k_chars` | grader_score per 1k action JSON chars | Token-proxy efficiency |
+| `retention_under_shift` | shifted_score / baseline_score | Pass `episode_chain_id` + `shift_label="baseline"` then `="shifted"` to a second `reset` to populate. None until both runs complete. |
+
 ## Tool catalog
 
 | Tool | Cost | Returns |
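The documented `sustainability_risk` and `strategic_quality` weightings reduce to two weighted sums. A standalone sketch of those formulas (hypothetical helper names; the real logic lives in `_compute_judge_report` inside the environment):

```python
# Sketch of the documented JudgeReport weightings; all inputs are 0..1.
# Helper names here are illustrative, not the env's actual API.

def sustainability_risk(energy_min: float, sleep_debt: float, low_energy_ratio: float) -> float:
    # 0.4*(1-energy_min) + 0.3*sleep_debt + 0.3*low_energy_ratio
    return 0.4 * (1.0 - energy_min) + 0.3 * sleep_debt + 0.3 * low_energy_ratio

def strategic_quality(engagement_per_post: float, intent_diversity: float, format_diversity: float) -> float:
    # 0.4*engagement_per_post + 0.3*intent_diversity + 0.3*format_diversity
    return 0.4 * engagement_per_post + 0.3 * intent_diversity + 0.3 * format_diversity

print(round(sustainability_risk(0.85, 0.1, 0.1), 2))  # 0.12
```

A creator who keeps energy high and sleeps normally stays near zero risk; the 0.4 weight on minimum energy makes single exhausting days the dominant term.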
RESEARCH.md CHANGED
@@ -135,7 +135,7 @@ Every constant and design decision in Viraltest is backed by a verifiable source
 
 **Key findings:** 3–5 posts/week doubles follower growth vs 1–2. 7+/week shows 20–35% engagement drop per post. Diminishing returns above 5/week.
 
-**What we use:** `FATIGUE_TIERS`, `WEEKLY_FATIGUE_THRESHOLD = 7`, `_theoretical_max_engagement` uses 5 posts/week × 4 weeks.
+**What we use:** `FATIGUE_TIERS`, `WEEKLY_FATIGUE_THRESHOLD = 7`, `_theoretical_max_engagement` caps at 5 posts/week × `TASK_HORIZON/7` weeks (≈21 posts for a 30-day horizon — the Buffer-defined sweet spot before fatigue penalties kick in).
 
 ---
 
@@ -196,6 +196,42 @@ Every constant and design decision in Viraltest is backed by a verifiable source
 
 ---
 
+### Later (2023) — Instagram Collaboration Posts Performance Study
+
+**URL:** [later.com/blog/instagram-collab-posts](https://later.com/blog/instagram-collab-posts)
+**Sample:** ~5K co-authored posts across the Later customer base (disclosed)
+**Methodology:** Comparison of Collab posts (single post shared to two feeds) vs equivalent solo posts from the same accounts.
+
+**Key findings:** Collab posts averaged ~88% more reach and ~40% more impressions than solo posts. Lift driven primarily by exposure to the partner's audience.
+
+**What we use:** `COLLAB_REACH_K = 0.60` — reach uplift scales with `(1 - overlap)` and is capped below the headline 88% because reach in our model is already amplified by `REACH_MULT` and `hour_mult`; net post-cap uplift on the constrained engagement value lands in the +30–50% band Later reports for matched-niche pairs.
+
+---
+
+### HypeAuditor (2024) — Influencer Collaboration Benchmark
+
+**URL:** [hypeauditor.com/blog/influencer-collaboration](https://hypeauditor.com/blog/influencer-collaboration)
+**Sample:** 10K+ Instagram collaboration posts across niches
+**Methodology:** Per-impression engagement rate, segmented by niche affinity (same niche, adjacent, cross-niche).
+
+**Key findings:** Same-niche collabs achieve ~30% higher engagement-per-impression than cross-niche; cross-niche collabs gain new followers but per-impression rate is roughly flat or slightly negative.
+
+**What we use:** `COLLAB_AFFINITY_K = 0.30` — engagement-per-impression boost scales with `overlap`, peaking when the partner's audience already shares the user's niche.
+
+---
+
+### Rival IQ (2025) — Cross-Industry Audience Overlap Patterns
+
+**URL:** [rivaliq.com/blog/social-media-industry-benchmark-report](https://www.rivaliq.com/blog/social-media-industry-benchmark-report/) (cross-industry chapter)
+
+**Key findings:** Same-industry account pairs share 40–65% of their audience; adjacent industries 20–35%; unrelated industries 5–15%. Cross-industry collabs drive new follower acquisition at roughly 2–2.5× the rate of same-industry collabs.
+
+**What we use:** `audience_overlap_matrix.json` values and `COLLAB_GROWTH_K = 1.50` — follower spillover scales with `(1 - overlap)`, peaking at +150% when overlap is zero (matches the upper end of Rival IQ's cross-industry follower-acquisition lift).
+
+Per-episode collab cadence is **not hard-capped**. Instead, each successive collab in a month is multiplied by `1 / (1 + COLLAB_FATIGUE_K · prior_collabs)` (`K = 0.3`): the multiplier falls to ~77% on the 2nd, ~63% on the 3rd, ~53% on the 4th. With base `engagement ≈ 1.52×` from a typical-overlap partner, this puts the 1st–2nd collab clearly above the no-collab baseline, the 3rd roughly neutral, and the 4th+ net-negative. This follows Cen et al. 2024's argument that disengagement-aware policies should price marginal exposure rather than impose binary caps, and lets the policy discover its own collab frequency from the reward gradient.
+
+---
+
 ### Goldman Sachs Global Investment Research (March 2025)
 
 **Title:** Creator Economy: Framing the Market Opportunity
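The diminishing-returns schedule above is easy to verify numerically. A minimal sketch of the documented formula (not the env code itself):

```python
# multiplier = 1 / (1 + COLLAB_FATIGUE_K * prior_collabs_this_month)
COLLAB_FATIGUE_K = 0.3

def collab_fatigue(prior_collabs: int) -> float:
    """Per-collab diminishing-returns multiplier; the 1st collab has no priors."""
    return 1.0 / (1.0 + COLLAB_FATIGUE_K * prior_collabs)

for n in range(4):
    print(f"collab #{n + 1}: x{collab_fatigue(n):.3f}")
# collab #1: x1.000
# collab #2: x0.769
# collab #3: x0.625
# collab #4: x0.526
```

Multiplied against a ~1.52× base collab engagement, the 3rd collab lands near 0.95× (roughly neutral) and the 4th near 0.80× (net-negative), matching the cadence argument above.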
inference.py CHANGED
@@ -35,7 +35,7 @@ _REQUESTED_MAX = int(os.getenv("MAX_STEPS", str(TASK_HORIZON)))
 MAX_STEPS = _REQUESTED_MAX if _ALLOW_SHORT else max(_REQUESTED_MAX, TASK_HORIZON)
 TEMPERATURE = 0.7
 MAX_TOKENS = 768
-SUCCESS_SCORE_THRESHOLD = 0.1
+SUCCESS_SCORE_THRESHOLD = 0.50
 
 ALL_TOPICS: List[str] = [
     topic for topics in TOPIC_CATEGORIES.values() for topic in topics
@@ -111,11 +111,24 @@ def log_step(step: int, action: str, reward: float, done: bool, error: Optional[
     )
 
 
-def log_end(success: bool, steps: int, score: float, rewards: List[float]) -> None:
+def log_end(
+    success: bool, steps: int, score: float, rewards: List[float],
+    headline: Optional[Any] = None,
+) -> None:
     rewards_str = ",".join(f"{r:.2f}" for r in rewards)
+    head_str = ""
+    if headline is not None:
+        retention = headline.retention_under_shift
+        retention_str = f"{retention:.2f}" if retention is not None else "n/a"
+        head_str = (
+            f" vs_baseline_pct={headline.vs_baseline_pct:+.2%} "
+            f"score_per_tool={headline.score_per_tool_call:.3f} "
+            f"score_per_1k_chars={headline.score_per_1k_chars:.3f} "
+            f"retention_under_shift={retention_str}"
+        )
     print(
         f"[END] success={str(success).lower()} steps={steps} "
-        f"score={score:.2f} rewards={rewards_str}",
+        f"score={score:.2f} rewards={rewards_str}{head_str}",
         flush=True,
     )
 
@@ -140,6 +153,14 @@ def format_observation(obs: Any) -> str:
     if coach:
         coach_str = f"Coach: delta={coach.get('delta', 0):.3f}, suggestion={coach.get('suggestion', '')}\n"
 
+    judge = getattr(obs, "judge_report", None)
+    judge_str = ""
+    if judge:
+        judge_str = (
+            f"Judge: compliance={judge.policy_compliance:.2f} risk={judge.sustainability_risk:.2f} "
+            f"strategy={judge.strategic_quality:.2f} | {judge.explanation}\n"
+        )
+
     signals = getattr(obs, "engagement_signals", None)
     signals_str = ""
     if signals:
@@ -153,7 +174,7 @@ Day: {day_name} (day_of_week={obs.day_of_week}) | days_elapsed={obs.days_elapsed
 Energy: {obs.creator_energy:.2f} | Burnout risk: {burnout:.2f} | Followers: {obs.follower_count}
 Engagement rate: {obs.engagement_rate:.3f} | Content queue: {obs.content_queue_size}
 API budget remaining: {budget}
-{signals_str}{coach_str}Tool results from last step:
+{signals_str}{coach_str}{judge_str}Tool results from last step:
 {tool_results_str if tool_results_str else ' (none)\n'}Your notes from last step: {notes_echo}
 Plan your tool calls and actions for today:""")
 
@@ -282,6 +303,7 @@ async def run_task(client: OpenAI, task: str) -> None:
     score = 0.0
     success = False
     env: Optional[ViraltestEnv] = None
+    headline: Optional[Any] = None
 
     log_start(task=task, env=BENCHMARK, model=MODEL_NAME)
 
@@ -336,6 +358,7 @@ async def run_task(client: OpenAI, task: str) -> None:
             if score == 0:
                 meta = getattr(result.observation, "metadata", {}) or {}
                 score = float(meta.get("grader_score", 0.0))
+            headline = getattr(result.observation, "headline_metrics", None)
             break
 
     success = score >= SUCCESS_SCORE_THRESHOLD
@@ -346,7 +369,7 @@ async def run_task(client: OpenAI, task: str) -> None:
             await env.close()
         except Exception as e:
            print(f"[DEBUG] env.close() error: {e}", flush=True)
-    log_end(success=success, steps=steps_taken, score=score, rewards=rewards)
+    log_end(success=success, steps=steps_taken, score=score, rewards=rewards, headline=headline)
 
 
 async def main() -> None:
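With a populated headline object, the new `[END]` suffix formats as below (sketch using a `SimpleNamespace` stand-in for the observation's `HeadlineMetrics`; the numbers are made up):

```python
from types import SimpleNamespace

# Hypothetical stand-in for HeadlineMetrics on the final observation.
headline = SimpleNamespace(
    vs_baseline_pct=0.12, score_per_tool_call=0.045,
    score_per_1k_chars=0.031, retention_under_shift=None,
)
# Same formatting logic as log_end's head_str:
retention = headline.retention_under_shift
retention_str = f"{retention:.2f}" if retention is not None else "n/a"
head_str = (
    f" vs_baseline_pct={headline.vs_baseline_pct:+.2%} "
    f"score_per_tool={headline.score_per_tool_call:.3f} "
    f"score_per_1k_chars={headline.score_per_1k_chars:.3f} "
    f"retention_under_shift={retention_str}"
)
print(f"[END] success=true steps=30 score=0.86 rewards=...{head_str}")
```

`retention_under_shift` prints `n/a` on single runs and only becomes a number once the baseline/shifted pair shares an `episode_chain_id`.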
models.py CHANGED
@@ -108,6 +108,35 @@ class ViraltestAction(Action):
         return deduped
 
 
+class JudgeReport(BaseModel):
+    """Auditable per-day evaluation by the in-env Regulator/Judge.
+
+    Scores are 0..1. `sustainability_risk` is RISK (higher = worse).
+    """
+
+    policy_compliance: float = Field(default=1.0, ge=0.0, le=1.0)
+    sustainability_risk: float = Field(default=0.0, ge=0.0, le=1.0)
+    strategic_quality: float = Field(default=0.0, ge=0.0, le=1.0)
+    explanation: str = Field(default="")
+    violations: List[str] = Field(default_factory=list)
+
+
+class HeadlineMetrics(BaseModel):
+    """Three headline numbers reported once per episode (final observation)."""
+
+    vs_baseline_pct: float = Field(default=0.0, description="(agent - heuristic_baseline) / heuristic_baseline")
+    score_per_tool_call: float = Field(default=0.0, description="grader_score / total_tool_calls (efficiency)")
+    score_per_1k_chars: float = Field(default=0.0, description="grader_score per 1k action chars (token-proxy efficiency)")
+    retention_under_shift: Optional[float] = Field(
+        default=None,
+        description="shifted_score / baseline_score, populated when both runs share an episode_chain_id",
+    )
+    heuristic_baseline_score: float = Field(default=0.0)
+    agent_score: float = Field(default=0.0)
+    total_tool_calls: int = Field(default=0, ge=0)
+    total_action_chars: int = Field(default=0, ge=0)
+
+
 class EngagementSignals(BaseModel):
     """Mosseri-aligned engagement decomposition (Jan 2025 official ranking signals)."""
 
@@ -161,6 +190,14 @@ class ViraltestObservation(Observation):
         default=None,
         description="Counterfactual feedback: delta between agent plan and heatmap-optimal plan",
     )
+    judge_report: Optional[JudgeReport] = Field(
+        default=None,
+        description="Regulator/Judge audit: policy compliance, sustainability risk, strategic quality + explanation",
+    )
+    headline_metrics: Optional[HeadlineMetrics] = Field(
+        default=None,
+        description="Final-observation hard numbers: improvement vs baseline, efficiency, shift retention",
+    )
 
     tool_results: List[ToolResult] = Field(default_factory=list, description="Results from tool_calls this step")
     agent_notes: Optional[str] = Field(default=None, description="Echo of agent's notes from previous step")
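The arithmetic behind the first three `HeadlineMetrics` fields is straightforward. A sketch with made-up episode numbers (the env fills these from its own counters):

```python
# Illustrative episode stats only — not values from a real run.
agent_score = 0.86
heuristic_baseline = 0.77      # e.g. the monthly_strategic baseline
total_tool_calls = 24
total_action_chars = 41_000

vs_baseline_pct = (agent_score - heuristic_baseline) / heuristic_baseline
score_per_tool_call = agent_score / total_tool_calls
score_per_1k_chars = agent_score / (total_action_chars / 1000)

print(f"{vs_baseline_pct:+.2%}")      # +11.69%
print(round(score_per_tool_call, 3))  # 0.036
print(round(score_per_1k_chars, 3))   # 0.021
```

Both efficiency metrics shrink as the agent burns more tool calls or emits longer action JSON for the same grader score, which is what makes them useful tie-breakers.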
server/data/audience_overlap_matrix.json CHANGED
@@ -1,16 +1,17 @@
 {
   "_meta": {
-    "description": "7×7 symmetric audience overlap matrix between competitor archetypes. Values 0.0-1.0 represent fraction of shared audience. Used by propose_collab to split engagement. Derived from niche proximity (same-niche pairs ~0.4-0.65, cross-niche ~0.05-0.20).",
-    "source": "Estimated from Rival IQ 2025 cross-industry overlap patterns + niche proximity heuristic"
+    "description": "8x8 symmetric audience overlap matrix between competitor archetypes and the user creator. Values 0.0-1.0 represent fraction of shared audience. Used by propose_collab to compute collab reward multipliers and by query_creator_pool to expose overlap to the agent. Same-niche pairs ~0.4-0.65, cross-niche ~0.05-0.20.",
+    "source": "Competitor pairs estimated from Rival IQ 2025 cross-industry overlap patterns + niche proximity heuristic. user_creator row tuned to a generic micro-creator (no locked niche): broad mass-market partners (lifestyle_blogger, viral_chaser) score highest; specialist partners (b2b_thought_leader, niche_expert) score lowest."
   },
-  "archetype_ids": ["niche_expert", "viral_chaser", "lifestyle_blogger", "b2b_thought_leader", "food_creator", "fitness_coach", "travel_creator"],
+  "archetype_ids": ["niche_expert", "viral_chaser", "lifestyle_blogger", "b2b_thought_leader", "food_creator", "fitness_coach", "travel_creator", "user_creator"],
   "matrix": [
-    [1.00, 0.12, 0.10, 0.40, 0.08, 0.10, 0.15],
-    [0.12, 1.00, 0.55, 0.10, 0.20, 0.25, 0.30],
-    [0.10, 0.55, 1.00, 0.15, 0.30, 0.35, 0.40],
-    [0.40, 0.10, 0.15, 1.00, 0.08, 0.10, 0.12],
-    [0.08, 0.20, 0.30, 0.08, 1.00, 0.45, 0.35],
-    [0.10, 0.25, 0.35, 0.10, 0.45, 1.00, 0.30],
-    [0.15, 0.30, 0.40, 0.12, 0.35, 0.30, 1.00]
+    [1.00, 0.12, 0.10, 0.40, 0.08, 0.10, 0.15, 0.10],
+    [0.12, 1.00, 0.55, 0.10, 0.20, 0.25, 0.30, 0.35],
+    [0.10, 0.55, 1.00, 0.15, 0.30, 0.35, 0.40, 0.40],
+    [0.40, 0.10, 0.15, 1.00, 0.08, 0.10, 0.12, 0.08],
+    [0.08, 0.20, 0.30, 0.08, 1.00, 0.45, 0.35, 0.25],
+    [0.10, 0.25, 0.35, 0.10, 0.45, 1.00, 0.30, 0.28],
+    [0.15, 0.30, 0.40, 0.12, 0.35, 0.30, 1.00, 0.30],
+    [0.10, 0.35, 0.40, 0.08, 0.25, 0.28, 0.30, 1.00]
   ]
 }
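A quick sanity check that the committed 8x8 matrix is well-formed — square, unit diagonal, symmetric (values copied inline here; in the repo you would `json.load` the file instead):

```python
# Matrix values copied from the new audience_overlap_matrix.json.
data = {
    "archetype_ids": ["niche_expert", "viral_chaser", "lifestyle_blogger",
                      "b2b_thought_leader", "food_creator", "fitness_coach",
                      "travel_creator", "user_creator"],
    "matrix": [
        [1.00, 0.12, 0.10, 0.40, 0.08, 0.10, 0.15, 0.10],
        [0.12, 1.00, 0.55, 0.10, 0.20, 0.25, 0.30, 0.35],
        [0.10, 0.55, 1.00, 0.15, 0.30, 0.35, 0.40, 0.40],
        [0.40, 0.10, 0.15, 1.00, 0.08, 0.10, 0.12, 0.08],
        [0.08, 0.20, 0.30, 0.08, 1.00, 0.45, 0.35, 0.25],
        [0.10, 0.25, 0.35, 0.10, 0.45, 1.00, 0.30, 0.28],
        [0.15, 0.30, 0.40, 0.12, 0.35, 0.30, 1.00, 0.30],
        [0.10, 0.35, 0.40, 0.08, 0.25, 0.28, 0.30, 1.00],
    ],
}
m = data["matrix"]
n = len(data["archetype_ids"])
assert all(len(row) == n for row in m), "matrix must be square"
assert all(m[i][i] == 1.0 for i in range(n)), "diagonal must be 1.0"
assert all(m[i][j] == m[j][i] for i in range(n) for j in range(n)), "must be symmetric"
print("overlap matrix OK:", n, "archetypes")
```

Symmetry matters because `propose_collab` reads the `user_creator` row while the meta description promises a symmetric matrix; an asymmetric edit would silently skew collab rewards.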
server/viraltest_environment.py CHANGED
@@ -27,6 +27,8 @@ try:
27
  from ..models import (
28
  CollabProposal,
29
  EngagementSignals,
 
 
30
  ReplyAction,
31
  ScheduledAction,
32
  ToolCall,
@@ -38,6 +40,8 @@ except ImportError:
38
  from models import (
39
  CollabProposal,
40
  EngagementSignals,
 
 
41
  ReplyAction,
42
  ScheduledAction,
43
  ToolCall,
@@ -156,11 +160,41 @@ WEEKLY_FATIGUE_MULT = 0.75
156
 
157
  SATURATION_PENALTY_K = 0.25
158
  TREND_DEFAULT_HALFLIFE_HOURS = 60
159
- COLLAB_MAX_PER_MONTH = 2
 
 
 
 
 
 
 
160
  REPLY_WINDOW_MINUTES = 90
161
  REPLY_REACH_BONUS = 1.4
162
  API_BUDGET_INITIAL = 100
163
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
164
  # Tool costs
165
  TOOL_COSTS = {
166
  "query_audience": 2,
@@ -231,7 +265,7 @@ TOOL_CATALOG = {
231
  "parameters": {},
232
  },
233
  "propose_collab": {
234
- "description": "Propose a collaboration post with a competitor. Splits engagement by audience overlap. Max 2 per month.",
235
  "parameters": {
236
  "partner_id": {"type": "string"},
237
  "content_type": {"type": "string", "enum": ["reel", "story", "carousel", "text_post"]},
@@ -280,10 +314,15 @@ class ViraltestEnvironment(Environment):
280
  self._api_budget = API_BUDGET_INITIAL
281
  self._collabs_this_month = 0
282
  self._collab_history: List[str] = []
 
283
  self._low_energy_days = 0
284
  self._total_posts_this_week = 0
285
  self._week_start_day = 0
286
  self._daily_signals = EngagementSignals()
 
 
 
 
287
 
288
  self._trending_topics = self._pick_trending_topics()
289
  self._trending_tags = self._pick_trending_tags()
@@ -468,6 +507,32 @@ class ViraltestEnvironment(Environment):
468
 
469
  return daily_fatigue * weekly_mult
470
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
471
  # ----- engagement signals (Mosseri-aligned) -----
472
 
473
  def _compute_engagement_signals(
@@ -556,19 +621,17 @@ class ViraltestEnvironment(Environment):
556
  elif tool.name == "query_creator_pool":
557
  pool = []
558
  for comp in self._competitors:
559
- idx = _OVERLAP_DATA["archetype_ids"].index(comp.id) if comp.id in _OVERLAP_DATA["archetype_ids"] else -1
560
- overlap = 0.15
561
- if idx >= 0 and idx < len(_OVERLAP_DATA["matrix"]):
562
- overlap = max(_OVERLAP_DATA["matrix"][idx])
563
- pool.append({"id": comp.id, "name": comp.name, "niche": comp.niche, "max_audience_overlap": round(overlap, 2)})
564
  return ToolResult(name=tool.name, data=pool, budget_remaining=self._api_budget)
565
 
566
  elif tool.name == "propose_collab":
567
- if self._collabs_this_month >= COLLAB_MAX_PER_MONTH:
568
- return ToolResult(name=tool.name, success=False, error="collab_limit_reached", budget_remaining=self._api_budget)
569
  partner_id = tool.arguments.get("partner_id", "")
570
- if partner_id in self._collab_history[-3:]:
571
- return ToolResult(name=tool.name, success=False, error="recently_collaborated", budget_remaining=self._api_budget)
572
  return ToolResult(name=tool.name, data={"status": "proposal_accepted", "partner_id": partner_id}, budget_remaining=self._api_budget)
573
 
574
  return ToolResult(name=tool.name, success=False, error=f"unknown tool: {tool.name}", budget_remaining=self._api_budget)
@@ -576,6 +639,9 @@ class ViraltestEnvironment(Environment):
576
  # ----- counterfactual coach -----
577
 
578
  def _compute_coach_feedback(self, agent_engagement: float) -> Dict[str, Any]:
 
 
 
579
  dow = self._day % 7
580
  row = _HEATMAP_GRID.get(dow, [1.0] * 24)
581
  best_hours = sorted(range(24), key=lambda h: row[h] if h < len(row) else 0, reverse=True)[:2]
@@ -584,13 +650,98 @@ class ViraltestEnvironment(Environment):
584
  optimal_eng = sum(row[h] * best_base * best_reach for h in best_hours)
585
  delta = agent_engagement - optimal_eng
586
  return {
587
- "optimal_hours": best_hours,
588
- "optimal_engagement_estimate": round(optimal_eng, 4),
589
- "your_engagement": round(agent_engagement, 4),
590
  "delta": round(delta, 4),
591
- "suggestion": "You're outperforming the heatmap baseline!" if delta >= 0 else "Consider posting at peak hours for better reach.",
 
 
 
 
592
  }
593
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
594
  # ----- core API -----
595
 
596
  def reset(self, seed: Optional[int] = None, episode_id: Optional[str] = None, **kwargs: Any) -> ViraltestObservation:
@@ -602,6 +753,9 @@ class ViraltestEnvironment(Environment):
602
  self._state = State(episode_id=episode_id or str(uuid4()), step_count=0)
603
  self._init_state()
604
 
 
 
 
605
  chain_id = kwargs.get("episode_chain_id")
606
  if chain_id and chain_id in _BRAND_STORE:
607
  brand = _BRAND_STORE[chain_id]
@@ -623,16 +777,24 @@ class ViraltestEnvironment(Environment):
623
  if action.notes:
624
  self._agent_notes = action.notes
625
 
626
- # Process tool calls first
 
 
 
 
627
  tool_results: List[ToolResult] = []
628
  for tc in action.tool_calls:
629
  result = self._dispatch_tool(tc)
630
  tool_results.append(result)
 
 
631
 
632
- # Process collab proposal
633
- if action.collab and self._collabs_this_month < COLLAB_MAX_PER_MONTH:
 
634
  self._collabs_this_month += 1
635
  self._collab_history.append(action.collab.partner_id)
 
636
 
637
  # Validate scheduled actions
638
  schedule: Dict[int, ScheduledAction] = {}
@@ -718,10 +880,12 @@ class ViraltestEnvironment(Environment):
718
 
719
  done = self._state.step_count >= TASK_HORIZON or self._energy <= 0.0
720
  coach = self._compute_coach_feedback(daily_engagement)
 
721
 
722
  if done:
723
  self._episode_done = True
724
  grader_score = self._run_grader()
 
725
 
726
  chain_id = kwargs.get("episode_chain_id")
727
  if chain_id:
@@ -738,7 +902,7 @@ class ViraltestEnvironment(Environment):
738
  grader_score=grader_score, daily_total_engagement=daily_engagement,
739
  daily_posts_made=daily_posts, daily_energy_min=energy_min,
740
  tool_results=tool_results, engagement_signals=daily_signals,
741
- coach_feedback=coach,
742
  )
743
  return self._final_observation
744
 
@@ -747,13 +911,15 @@ class ViraltestEnvironment(Environment):
747
  daily_total_engagement=daily_engagement,
748
  daily_posts_made=daily_posts, daily_energy_min=energy_min,
749
  tool_results=tool_results, engagement_signals=daily_signals,
750
- coach_feedback=coach,
751
  )
752
 
753
  def _process_hour_action(self, sa: ScheduledAction) -> Tuple[float, float, Optional[EngagementSignals]]:
754
  engagement = 0.0
755
  signals = None
756
 
 
 
757
  if sa.action_type == "post":
758
  cost = CONTENT_ENERGY_COST.get(sa.content_type, 0.1)
759
  if self._content_queue > 0:
@@ -790,6 +956,12 @@ class ViraltestEnvironment(Environment):
790
  * trending_bonus * comp_diff * fatigue * algo_mult
791
  * niche_mult * saturation_factor
792
  )
 
 
 
 
 
 
793
  engagement = min(engagement, 5.0)
794
 
795
  signals = self._compute_engagement_signals(sa.content_type, engagement, sa.intent)
@@ -819,7 +991,7 @@ class ViraltestEnvironment(Environment):
819
  self._time_since_last_post = 0
820
 
821
  if engagement > 0:
822
- self._followers += int(engagement * 100)
823
 
824
  elif sa.action_type == "create_content":
825
  self._energy = max(0.0, self._energy - CREATE_CONTENT_COST)
@@ -955,6 +1127,8 @@ class ViraltestEnvironment(Environment):
955
  tool_results: Optional[List[ToolResult]] = None,
956
  engagement_signals: Optional[EngagementSignals] = None,
957
  coach_feedback: Optional[Dict[str, Any]] = None,
 
 
958
  ) -> ViraltestObservation:
959
  recent_eng = self._engagement_history[-10:] if self._engagement_history else []
960
  eng_rate = sum(recent_eng) / len(recent_eng) if recent_eng else 0.0
@@ -984,6 +1158,8 @@ class ViraltestEnvironment(Environment):
984
  daily_energy_min=round(daily_energy_min, 3),
985
  engagement_signals=engagement_signals,
986
  coach_feedback=coach_feedback,
 
 
987
  tool_results=tool_results or [],
988
  agent_notes=self._agent_notes,
989
  api_budget_remaining=self._api_budget,
@@ -1006,35 +1182,33 @@ class ViraltestEnvironment(Environment):
1006
  return 0.0
1007
 
1008
  def _theoretical_max_engagement(self) -> float:
 
 
 
1009
  best_base = max(BASE_ENGAGEMENT.values())
1010
  best_reach = max(REACH_MULT.values())
1011
  best_niche = max(_NICHE_MULTIPLIERS.values()) if _NICHE_MULTIPLIERS else 1.0
1012
 
1013
- active_days = 26
1014
- rest_days = TASK_HORIZON - active_days
1015
- posts_per_active_day = 2
1016
 
1017
  avg_heatmap_peak = 1.0
1018
  if _HEATMAP_GRID:
1019
- day_peaks = []
1020
- for dow, row in _HEATMAP_GRID.items():
1021
- top2 = sorted(row, reverse=True)[:posts_per_active_day]
1022
- day_peaks.append(sum(top2) / len(top2) if top2 else 1.0)
1023
  avg_heatmap_peak = sum(day_peaks) / len(day_peaks) if day_peaks else 1.0
1024
 
 
 
1025
  trending_bonus = 1.25
1026
  tag_boost = 1.1
1027
 
1028
- total_posts = active_days * posts_per_active_day
1029
-
1030
- weekly_fatigue = 1.0
1031
- posts_per_week = total_posts / (TASK_HORIZON / 7.0)
1032
- if posts_per_week >= WEEKLY_FATIGUE_THRESHOLD:
1033
- weekly_fatigue = WEEKLY_FATIGUE_MULT
1034
-
1035
  per_post = (
1036
  best_base * best_reach * best_niche
1037
- * avg_heatmap_peak * trending_bonus * tag_boost * weekly_fatigue
1038
  )
1039
  return per_post * total_posts
1040
 
 
27
  from ..models import (
28
  CollabProposal,
29
  EngagementSignals,
30
+ HeadlineMetrics,
31
+ JudgeReport,
32
  ReplyAction,
33
  ScheduledAction,
34
  ToolCall,
 
40
  from models import (
41
  CollabProposal,
42
  EngagementSignals,
43
+ HeadlineMetrics,
44
+ JudgeReport,
45
  ReplyAction,
46
  ScheduledAction,
47
  ToolCall,
 
160
 
161
  SATURATION_PENALTY_K = 0.25
162
  TREND_DEFAULT_HALFLIFE_HOURS = 60
163
+ # Collab reward shaping (Later 2023 reach study, HypeAuditor 2024 niche affinity, Rival IQ 2025 overlap patterns,
164
+ # Cen et al. 2024 disengagement model for diminishing returns instead of a hard cap).
165
+ COLLAB_REACH_K = 0.60 # cross-audience exposure: capped reach uplift when overlap is 0
166
+ COLLAB_AFFINITY_K = 0.30 # same-audience affinity: per-impression engagement uplift when overlap is 1
167
+ COLLAB_GROWTH_K = 1.50 # cross-pollination follower spillover, scales (1 - overlap)
168
+ COLLAB_PARTNER_REPEAT_PENALTY = 0.7 # discount on multipliers when partner reused this brand
169
+ COLLAB_FATIGUE_K = 0.3 # per-collab diminishing-returns factor: 1/(1+K*prior_collabs_this_episode)
170
+
171
  REPLY_WINDOW_MINUTES = 90
172
  REPLY_REACH_BONUS = 1.4
173
  API_BUDGET_INITIAL = 100
174
 
175
+ # Heuristic baselines for headline metric `vs_baseline_pct`.
176
+ # Data-driven: loaded from `plots/training_summary.json["smart_heuristic"]` recorded by
177
+ # `training/run_training_evidence.py`. Falls back to conservative calibration constants
178
+ # if the file is missing (audit trail: see RESEARCH.md for the rule-based policy spec).
179
+ def _load_heuristic_baselines() -> Dict[str, float]:
180
+ summary = Path(__file__).parent.parent / "plots" / "training_summary.json"
181
+ try:
182
+ data = json.loads(summary.read_text())
183
+ empirical = data.get("smart_heuristic") or {}
184
+ return {k: float(v) for k, v in empirical.items() if k in VALID_TASKS}
185
+ except Exception:
186
+ return {}
187
+
188
+ HEURISTIC_BASELINE_SCORES: Dict[str, float] = _load_heuristic_baselines() or {
189
+ "monthly_engage": 0.43,
190
+ "monthly_strategic": 0.77,
191
+ "monthly_competitive": 0.81,
192
+ }
193
+
194
+ # Cross-episode store for distribution-shift retention. Keyed by episode_chain_id, stores
195
+ # {"baseline": score, "shifted": score} so the second run can compute retention_under_shift.
196
+ _SHIFT_HISTORY: Dict[str, Dict[str, float]] = {}
197
+
198
  # Tool costs
199
  TOOL_COSTS = {
200
  "query_audience": 2,
 
265
  "parameters": {},
266
  },
267
  "propose_collab": {
268
+ "description": "Propose a collab post with a competitor at a specific hour. The post you schedule at that hour will be co-authored with the partner.",
269
  "parameters": {
270
  "partner_id": {"type": "string"},
271
  "content_type": {"type": "string", "enum": ["reel", "story", "carousel", "text_post"]},
 
314
  self._api_budget = API_BUDGET_INITIAL
315
  self._collabs_this_month = 0
316
  self._collab_history: List[str] = []
317
+ self._active_collab: Optional[CollabProposal] = None
318
  self._low_energy_days = 0
319
  self._total_posts_this_week = 0
320
  self._week_start_day = 0
321
  self._daily_signals = EngagementSignals()
322
+ self._total_tool_calls = 0
323
+ self._total_action_chars = 0
324
+ self._shift_label: Optional[str] = None
325
+ self._chain_id: Optional[str] = None
326
 
327
  self._trending_topics = self._pick_trending_topics()
328
  self._trending_tags = self._pick_trending_tags()
 
507
 
508
  return daily_fatigue * weekly_mult
509
 
510
+ # ----- collab multipliers (overlap-driven) -----
511
+
512
+ def _user_partner_overlap(self, partner_id: str) -> Optional[float]:
513
+ ids = _OVERLAP_DATA.get("archetype_ids", [])
514
+ if "user_creator" not in ids or partner_id not in ids:
515
+ return None
516
+ u = ids.index("user_creator")
517
+ p = ids.index(partner_id)
518
+ return _OVERLAP_DATA["matrix"][u][p]
519
+
520
+ def _collab_multipliers(self, partner_id: str) -> Tuple[float, float]:
521
+ """Returns (engagement_multiplier, follower_growth_multiplier)."""
522
+ o = self._user_partner_overlap(partner_id)
523
+ if o is None:
524
+ return 1.0, 1.0
525
+ reach = 1.0 + (1.0 - o) * COLLAB_REACH_K
526
+ affinity = 1.0 + o * COLLAB_AFFINITY_K
527
+ growth = 1.0 + (1.0 - o) * COLLAB_GROWTH_K
528
+ eng_boost = reach * affinity
529
+ if partner_id in self._collab_history[:-1]:
530
+ eng_boost *= COLLAB_PARTNER_REPEAT_PENALTY
531
+ growth *= COLLAB_PARTNER_REPEAT_PENALTY
532
+ prior = max(0, self._collabs_this_month - 1)
533
+ fatigue = 1.0 / (1.0 + COLLAB_FATIGUE_K * prior)
534
+ return eng_boost * fatigue, growth * fatigue
535
+
536
  # ----- engagement signals (Mosseri-aligned) -----
537
 
538
  def _compute_engagement_signals(
 
         elif tool.name == "query_creator_pool":
             pool = []
             for comp in self._competitors:
+                overlap = self._user_partner_overlap(comp.id)
+                pool.append({
+                    "id": comp.id, "name": comp.name, "niche": comp.niche,
+                    "audience_overlap": round(overlap, 2) if overlap is not None else None,
+                })
             return ToolResult(name=tool.name, data=pool, budget_remaining=self._api_budget)
 
         elif tool.name == "propose_collab":
             partner_id = tool.arguments.get("partner_id", "")
+            if partner_id not in [c.id for c in self._competitors]:
+                return ToolResult(name=tool.name, success=False, error=f"unknown partner: {partner_id}", budget_remaining=self._api_budget)
             return ToolResult(name=tool.name, data={"status": "proposal_accepted", "partner_id": partner_id}, budget_remaining=self._api_budget)
 
         return ToolResult(name=tool.name, success=False, error=f"unknown tool: {tool.name}", budget_remaining=self._api_budget)
 
     # ----- counterfactual coach -----
 
     def _compute_coach_feedback(self, agent_engagement: float) -> Dict[str, Any]:
+        # World-modeling discipline: emit a SCALAR delta only (no optimal_hours leak).
+        # Agents must use `query_trends` / `predict_engagement` to discover *which* hours
+        # are optimal — coach only signals "you're above/below the heatmap optimum today".
         dow = self._day % 7
         row = _HEATMAP_GRID.get(dow, [1.0] * 24)
         best_hours = sorted(range(24), key=lambda h: row[h] if h < len(row) else 0, reverse=True)[:2]
 
         optimal_eng = sum(row[h] * best_base * best_reach for h in best_hours)
         delta = agent_engagement - optimal_eng
         return {
             "delta": round(delta, 4),
+            "suggestion": (
+                "Above heatmap optimum today."
+                if delta >= 0
+                else "Below heatmap optimum — try `query_trends` / `predict_engagement` to find peak hours."
+            ),
         }
 
 
+    # ----- regulator / judge mode (deterministic, explainable) -----
+
+    def _compute_judge_report(
+        self,
+        action: ViraltestAction,
+        daily_engagement: float,
+        daily_posts: int,
+        energy_min: float,
+        errors: List[str],
+    ) -> JudgeReport:
+        violations: List[str] = []
+
+        pc = 1.0
+        if daily_posts > 5:
+            violations.append(f"posts_today={daily_posts} exceeds tier-4 fatigue cliff (Buffer 2.1M)")
+            pc -= 0.30
+        elif daily_posts > 2:
+            violations.append(f"posts_today={daily_posts} enters fatigue tier (>2/day)")
+            pc -= 0.10
+        if self._total_posts_this_week > WEEKLY_FATIGUE_THRESHOLD:
+            violations.append(f"weekly posts={self._total_posts_this_week} > {WEEKLY_FATIGUE_THRESHOLD} (Buffer 2.1M cap)")
+            pc -= 0.20
+        if self._collabs_this_month >= 4:
+            violations.append(f"collab cadence={self._collabs_this_month} net-negative beyond 3 (Cen 2024)")
+            pc -= 0.20
+        if errors:
+            violations.append(f"plan_errors={len(errors)}")
+            pc -= 0.05 * len(errors)
+        if self._hours_since_sleep > 22:
+            violations.append(f"sleep_debt: {self._hours_since_sleep}h awake (Van Dongen 2003)")
+            pc -= 0.10
+
+        burnout_pressure = (1.0 - energy_min) * 0.4 + self._sleep_debt * 0.3 + (self._low_energy_days / 5.0) * 0.3
+        sustainability_risk = max(0.0, min(1.0, burnout_pressure))
+
+        intents_used = {sa.intent for sa in action.scheduled_actions if sa.intent}
+        formats_used = {sa.content_type for sa in action.scheduled_actions if sa.action_type == "post" and sa.content_type}
+        eng_per_post = daily_engagement / max(1, daily_posts)
+        sq = (
+            0.40 * min(1.0, eng_per_post / 1.2)
+            + 0.30 * min(1.0, len(intents_used) / 2.0)
+            + 0.30 * min(1.0, len(formats_used) / 2.0)
+        )
+
+        explanation = (
+            f"compliance={max(0.0, pc):.2f} risk={sustainability_risk:.2f} strategy={sq:.2f} | "
+            + (("violations: " + "; ".join(violations)) if violations else "no policy violations")
+        )
+
+        return JudgeReport(
+            policy_compliance=max(0.0, min(1.0, pc)),
+            sustainability_risk=sustainability_risk,
+            strategic_quality=max(0.0, min(1.0, sq)),
+            explanation=explanation,
+            violations=violations,
+        )
+
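The compliance score in this hunk is a weighted-deduction scheme: start at 1.0 and subtract a fixed penalty per violated rule. A stripped-down sketch with the thresholds and weights copied from the diff (the rule set here is abbreviated, and inputs are plain arguments instead of env state):

```python
def policy_compliance(daily_posts: int, weekly_posts: int, collabs_this_month: int) -> float:
    """Abbreviated sketch of the judge's weighted-deduction compliance score."""
    pc = 1.0
    if daily_posts > 5:          # tier-4 fatigue cliff (Buffer 2.1M)
        pc -= 0.30
    elif daily_posts > 2:        # fatigue tier (>2/day)
        pc -= 0.10
    if weekly_posts > 7:         # weekly cap (Buffer 2.1M)
        pc -= 0.20
    if collabs_this_month >= 4:  # diminishing collab returns (Cen 2024)
        pc -= 0.20
    return max(0.0, min(1.0, pc))
```

Because penalties are fixed deductions on a clamped score, the report stays deterministic and each point of lost compliance maps to one named, sourced rule.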
+    def _compute_headline_metrics(self, grader_score: float) -> HeadlineMetrics:
+        baseline = HEURISTIC_BASELINE_SCORES.get(self._task, 0.30)
+        vs_pct = (grader_score - baseline) / baseline if baseline > 0 else 0.0
+        spt = grader_score / max(1, self._total_tool_calls)
+        sp1k = grader_score / max(1.0, self._total_action_chars / 1000.0)
+
+        retention: Optional[float] = None
+        if self._chain_id:
+            entry = _SHIFT_HISTORY.setdefault(self._chain_id, {})
+            label = self._shift_label or "baseline"
+            entry[label] = grader_score
+            base = entry.get("baseline")
+            shifted = entry.get("shifted")
+            if base is not None and shifted is not None and base > 0:
+                retention = shifted / base
+
+        return HeadlineMetrics(
+            vs_baseline_pct=round(vs_pct, 4),
+            score_per_tool_call=round(spt, 4),
+            score_per_1k_chars=round(sp1k, 4),
+            retention_under_shift=round(retention, 4) if retention is not None else None,
+            heuristic_baseline_score=round(baseline, 4),
+            agent_score=round(grader_score, 4),
+            total_tool_calls=self._total_tool_calls,
+            total_action_chars=self._total_action_chars,
+        )
+
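The headline numbers reduce to three ratios. A hypothetical worked example (the 0.77 baseline matches the monthly_strategic figure quoted in the README table; the agent score, tool-call count, and char count here are made up for illustration):

```python
def headline(agent_score: float, baseline: float, tool_calls: int, action_chars: int) -> dict:
    """Sketch of the three core ratios in _compute_headline_metrics."""
    return {
        # relative lift over the heuristic baseline
        "vs_baseline_pct": (agent_score - baseline) / baseline if baseline > 0 else 0.0,
        # efficiency: score earned per tool call
        "score_per_tool_call": agent_score / max(1, tool_calls),
        # verbosity: score earned per 1k chars of emitted actions
        "score_per_1k_chars": agent_score / max(1.0, action_chars / 1000.0),
    }

m = headline(agent_score=0.88, baseline=0.77, tool_calls=44, action_chars=22000)
```

The `max(1, ...)` / `max(1.0, ...)` guards mirror the diff: an agent that never calls tools or emits almost no text still gets finite, comparable ratios.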
     # ----- core API -----
 
     def reset(self, seed: Optional[int] = None, episode_id: Optional[str] = None, **kwargs: Any) -> ViraltestObservation:
 
         self._state = State(episode_id=episode_id or str(uuid4()), step_count=0)
         self._init_state()
 
+        self._shift_label = kwargs.get("shift_label")
+        self._chain_id = kwargs.get("episode_chain_id")
+
         chain_id = kwargs.get("episode_chain_id")
         if chain_id and chain_id in _BRAND_STORE:
             brand = _BRAND_STORE[chain_id]
 
         if action.notes:
             self._agent_notes = action.notes
 
+        try:
+            self._total_action_chars += len(action.model_dump_json())
+        except Exception:
+            pass
+
         tool_results: List[ToolResult] = []
         for tc in action.tool_calls:
             result = self._dispatch_tool(tc)
             tool_results.append(result)
+            if result.success:
+                self._total_tool_calls += 1
 
+        # Process collab proposal (no hard cap; diminishing returns enforced via _collab_multipliers)
+        self._active_collab = None
+        if action.collab:
             self._collabs_this_month += 1
             self._collab_history.append(action.collab.partner_id)
+            self._active_collab = action.collab
 
         # Validate scheduled actions
         schedule: Dict[int, ScheduledAction] = {}
 
 
         done = self._state.step_count >= TASK_HORIZON or self._energy <= 0.0
         coach = self._compute_coach_feedback(daily_engagement)
+        judge = self._compute_judge_report(action, daily_engagement, daily_posts, energy_min, errors)
 
         if done:
             self._episode_done = True
             grader_score = self._run_grader()
+            headline = self._compute_headline_metrics(grader_score)
 
             chain_id = kwargs.get("episode_chain_id")
             if chain_id:
 
                 grader_score=grader_score, daily_total_engagement=daily_engagement,
                 daily_posts_made=daily_posts, daily_energy_min=energy_min,
                 tool_results=tool_results, engagement_signals=daily_signals,
+                coach_feedback=coach, judge_report=judge, headline_metrics=headline,
             )
             return self._final_observation
 
 
             daily_total_engagement=daily_engagement,
             daily_posts_made=daily_posts, daily_energy_min=energy_min,
             tool_results=tool_results, engagement_signals=daily_signals,
+            coach_feedback=coach, judge_report=judge,
         )
 
     def _process_hour_action(self, sa: ScheduledAction) -> Tuple[float, float, Optional[EngagementSignals]]:
         engagement = 0.0
         signals = None
 
+        collab_growth_mult = 1.0
+
         if sa.action_type == "post":
             cost = CONTENT_ENERGY_COST.get(sa.content_type, 0.1)
             if self._content_queue > 0:
 
                 * trending_bonus * comp_diff * fatigue * algo_mult
                 * niche_mult * saturation_factor
             )
+
+            if self._active_collab is not None and self._active_collab.hour == sa.hour:
+                eng_m, growth_m = self._collab_multipliers(self._active_collab.partner_id)
+                engagement *= eng_m
+                collab_growth_mult = growth_m
+
             engagement = min(engagement, 5.0)
 
             signals = self._compute_engagement_signals(sa.content_type, engagement, sa.intent)
 
             self._time_since_last_post = 0
 
             if engagement > 0:
+                self._followers += int(engagement * 100 * collab_growth_mult)
 
         elif sa.action_type == "create_content":
             self._energy = max(0.0, self._energy - CREATE_CONTENT_COST)
 
         tool_results: Optional[List[ToolResult]] = None,
         engagement_signals: Optional[EngagementSignals] = None,
         coach_feedback: Optional[Dict[str, Any]] = None,
+        judge_report: Optional[JudgeReport] = None,
+        headline_metrics: Optional[HeadlineMetrics] = None,
     ) -> ViraltestObservation:
         recent_eng = self._engagement_history[-10:] if self._engagement_history else []
         eng_rate = sum(recent_eng) / len(recent_eng) if recent_eng else 0.0
 
             daily_energy_min=round(daily_energy_min, 3),
             engagement_signals=engagement_signals,
             coach_feedback=coach_feedback,
+            judge_report=judge_report,
+            headline_metrics=headline_metrics,
             tool_results=tool_results or [],
             agent_notes=self._agent_notes,
             api_budget_remaining=self._api_budget,
 
         return 0.0
 
     def _theoretical_max_engagement(self) -> float:
+        # Buffer 2.1M (RESEARCH.md): 3–5 posts/week doubles follower growth vs 1–2,
+        # diminishing returns above 5/week, 20–35% engagement drop per post above 7/week.
+        # Cap at 5 posts/week × 4 weeks = 20 posts/month (sweet-spot, no fatigue penalty).
         best_base = max(BASE_ENGAGEMENT.values())
         best_reach = max(REACH_MULT.values())
         best_niche = max(_NICHE_MULTIPLIERS.values()) if _NICHE_MULTIPLIERS else 1.0
 
+        posts_per_week = 5
+        weeks_in_horizon = TASK_HORIZON / 7.0
+        total_posts = int(round(posts_per_week * weeks_in_horizon))
 
         avg_heatmap_peak = 1.0
         if _HEATMAP_GRID:
+            day_peaks = [
+                max(row) if row else 1.0
+                for row in _HEATMAP_GRID.values()
+            ]
             avg_heatmap_peak = sum(day_peaks) / len(day_peaks) if day_peaks else 1.0
 
+        # Trending + tag uplifts: tier-1 industry data shows ~1.2-1.3x for trending topics
+        # and ~1.05-1.15x for high-performance tags. Mid-range used to avoid headroom inflation.
         trending_bonus = 1.25
         tag_boost = 1.1
 
         per_post = (
             best_base * best_reach * best_niche
+            * avg_heatmap_peak * trending_bonus * tag_boost
         )
         return per_post * total_posts
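The headroom arithmetic in this last hunk is just the sweet-spot cadence times a best-case per-post multiplier. A sketch with an assumed `per_post` value (the real figure comes from the env's `BASE_ENGAGEMENT`/`REACH_MULT`/niche/heatmap tables):

```python
def theoretical_max(horizon_days: int, per_post: float, posts_per_week: int = 5) -> float:
    """Headroom estimate: fatigue-free cadence cap times best-case per-post engagement."""
    # 5 posts/week is the Buffer-derived sweet spot; above it, fatigue penalties kick in.
    total_posts = round(posts_per_week * horizon_days / 7.0)
    return per_post * total_posts

# e.g. a 28-day horizon at the 5/week sweet spot yields 20 posts of headroom
cap = theoretical_max(28, per_post=1.0)
```

Keeping the cap at the sweet-spot cadence rather than the physical maximum means a perfect agent can actually approach a normalized score of 1.0 instead of chasing an unreachable denominator.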