Spaces: abrown31/open-range

Lars Talian committed on Mar 8

Commit

228ed67

1 Parent(s): a49b769

Make mutation policy weights explicit

Files changed (6)

docs/mutation_policy.md +107 -0
scripts/calibrate_mutation_policy.py +131 -0
src/open_range/builder/mutation_policy.py +373 -78
src/open_range/builder/mutator.py +8 -0
src/open_range/server/runtime.py +2 -3
tests/test_mutation_policy.py +177 -1

docs/mutation_policy.md ADDED

	@@ -0,0 +1,107 @@

+# Mutation Policy Weights
+`PopulationMutationPolicy` is a hand-authored heuristic policy, but its
+weights and shaping constants are now explicit in
+`src/open_range/builder/mutation_policy.py` under `MutationPolicySettings`.
+The policy has three jobs:
+1. Choose which stored snapshot is the best parent to mutate next.
+2. Choose which structural mutation op to apply.
+3. Choose which security/noise mutation op to apply.
+## Parent Selection Terms
+These fields live in `MutationPolicySettings.parent`.
+| Field | Default | Why it exists |
+| --- | ---: | --- |
+| `frontier_weight` | `0.28` | Prefer snapshots near the current learning frontier instead of trivially solved or impossible ones. |
+| `replay_weight` | `0.18` | Revisit under-played snapshots so the curriculum does not collapse to a tiny subset. |
+| `novelty_weight` | `0.16` | Favor rarer vulnerability mixes across the population. |
+| `weak_overlap_weight` | `0.18` | Bias parent choice toward snapshots that exercise known weak areas. |
+| `lineage_balance_weight` | `0.08` | Prevent one root lineage from dominating the pool. |
+| `depth_balance_weight` | `0.04` | Avoid over-sampling very deep descendant chains. |
+| `recency_weight` | `0.04` | Cool down parents that were used repeatedly in the recent window. |
+| `complexity_weight` | `0.04` | Slightly prefer richer parents with more structure to mutate from. |
+Shaping constants in the same model explain how those raw signals are formed:
+| Field | Default | Meaning |
+| --- | ---: | --- |
+| `minimum_total` | `0.05` | Sampling floor for low-scoring parents. |
+| `unplayed_frontier_score` | `0.40` | Frontier score used before any play stats exist. |
+| `empty_vuln_novelty_score` | `0.25` | Novelty fallback for snapshots with no typed vulnerabilities. |
+| `preferred_generation_depth` | `3.0` | Depth after which descendant chains start being penalized. |
+| `complexity_vuln_factor` | `0.25` | Complexity contribution per vulnerability. |
+| `complexity_golden_path_factor` | `0.03` | Complexity contribution per golden-path step. |
+| `complexity_dependency_edge_factor` | `0.02` | Complexity contribution per dependency edge. |
+| `complexity_trust_edge_factor` | `0.02` | Complexity contribution per trust edge. |
+| `complexity_cap` | `1.0` | Cap for the normalized complexity score. |
+## Mutation Selection Terms
+These fields live in `MutationPolicySettings.mutation`.
+| Field | Default | Why it exists |
+| --- | ---: | --- |
+| `curriculum_weight` | `0.38` | Prefer ops that target the agent's current weakness. |
+| `novelty_weight` | `0.24` | Prefer ops that open new surfaces or vary episode shape. |
+| `structural_gain_weight` | `0.28` | Prefer ops that materially expand the scenario graph. |
+| `lineage_weight` | `0.10` | Slight bias toward shallower lineage when all else is equal. |
+| `minimum_total` | `0.05` | Sampling floor for low-scoring mutation ops. |
+Raw novelty bonuses in `MutationPolicySettings.novelty`:
+| Field | Default | Meaning |
+| --- | ---: | --- |
+| `base_bonus` | `0.40` | Baseline novelty for every op. |
+| `new_vuln_class_bonus` | `1.0` | Extra novelty for a vulnerability class not seen recently. |
+| `new_noise_surface_bonus` | `0.50` | Extra novelty for noise on a new attack surface. |
+| `structural_op_bonus` | `0.40` | Extra novelty for non-security ops that change the graph. |
+Raw curriculum bonuses in `MutationPolicySettings.curriculum`:
+| Field | Default | Meaning |
+| --- | ---: | --- |
+| `base_bonus` | `0.35` | Baseline curriculum value for every op. |
+| `weak_area_bonus` | `1.50` | Reward seeding a vulnerability in a known weak area. |
+| `new_vuln_bonus` | `0.40` | Reward introducing a vulnerability class not present in the parent. |
+| `chain_length_bonus` | `0.60` | Reward edges that help satisfy multi-hop chain requirements. |
+| `focus_identity_bonus` | `0.50` | Reward identity-layer ops when curriculum focus is identity. |
+| `focus_infra_bonus` | `0.50` | Reward infra-layer ops when curriculum focus is infra. |
+| `focus_process_bonus` | `0.40` | Reward benign noise when focus is process realism. |
+## Structural Gain Table
+These fields live in `MutationPolicySettings.structural_gains`.
+| Op Type | Default |
+| --- | ---: |
+| `add_service` | `1.00` |
+| `add_dependency_edge` | `0.90` |
+| `add_trust_edge` | `0.85` |
+| `add_user` | `0.80` |
+| `seed_vuln` | `0.70` |
+| `add_benign_noise` | `0.30` |
+| `default_gain` | `0.20` |
+## Tuning Path
+You can swap weights without touching policy code:
+1. Write a JSON or YAML file matching `MutationPolicySettings`.
+2. Load it with `load_mutation_policy_settings(path)` or pass it into `PopulationMutationPolicy(settings=...)`.
+3. Compare it against the default policy with:
+```bash
+PYTHONPATH=src .venv/bin/python scripts/calibrate_mutation_policy.py \
+  --store-dir snapshots \
+  --stats path/to/snapshot_stats.json \
+  --context path/to/build_context.json \
+  --settings tuned=path/to/policy_settings.yaml
+```
+The calibration output is JSON so it can be diffed, archived, or fed into
+notebooks. Parent-selection logs and `MutationPlan.score_breakdown` now expose
+weighted contributions instead of only raw feature values.
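
Reviewer sketch (not part of the commit): the parent score described in the doc above is a weighted sum of raw signals with a sampling floor. The snippet below reproduces that arithmetic in isolation, assuming that the `_weighted_contributions` helper multiplies each signal by its matching weight, as the pre-change inline formula did. The `example_signals` values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Default parent weights copied from the "Parent Selection Terms" table.
PARENT_WEIGHTS = {
    "frontier": 0.28,
    "replay": 0.18,
    "novelty": 0.16,
    "weak_overlap": 0.18,
    "lineage_balance": 0.08,
    "depth_balance": 0.04,
    "recency": 0.04,
    "complexity": 0.04,
}
MINIMUM_TOTAL = 0.05  # default `minimum_total` sampling floor


def weighted_contributions(signals, weights):
    """Assumed behaviour: each contribution is signal * weight."""
    return {name: signals[name] * weights[name] for name in weights}


def parent_total(signals, weights=PARENT_WEIGHTS, minimum_total=MINIMUM_TOTAL):
    """Sum the contributions, apply the floor, and round to 4 places."""
    contributions = weighted_contributions(signals, weights)
    return round(max(sum(contributions.values()), minimum_total), 4)


# Hypothetical signals for a never-played snapshot with one weak-area overlap.
example_signals = {
    "frontier": 0.40,        # unplayed_frontier_score
    "replay": 1.0,           # 1 / (plays + 1) with plays == 0
    "novelty": 0.5,
    "weak_overlap": 1.0,     # a count of overlapping weak areas, not a ratio
    "lineage_balance": 0.5,
    "depth_balance": 1.0,
    "recency": 1.0,
    "complexity": 0.5,
}

if __name__ == "__main__":
    print(parent_total(example_signals))
```

The default weights sum to 1.00, but `weak_overlap` is a raw count of overlapping weak areas, so a parent's total can exceed 1.0 when several weak areas match.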

scripts/calibrate_mutation_policy.py ADDED

	@@ -0,0 +1,131 @@

+#!/usr/bin/env python3
+"""Offline calibration harness for PopulationMutationPolicy."""
+from __future__ import annotations
+import argparse
+import asyncio
+import json
+from pathlib import Path
+from typing import Any
+import yaml
+from open_range.builder.mutation_policy import (
+    PopulationMutationPolicy,
+    load_mutation_policy_settings,
+)
+from open_range.builder.snapshot_store import SnapshotStore
+from open_range.protocols import BuildContext
+def _load_object(path: str | None) -> dict[str, Any]:
+    if not path:
+        return {}
+    payload = Path(path).read_text(encoding="utf-8")
+    suffix = Path(path).suffix.lower()
+    if suffix in {".yaml", ".yml"}:
+        data = yaml.safe_load(payload) or {}
+    else:
+        data = json.loads(payload)
+    if not isinstance(data, dict):
+        raise ValueError(f"expected an object in {path}")
+    return data
+def _parse_settings_arg(value: str) -> tuple[str, Path]:
+    if "=" in value:
+        label, raw_path = value.split("=", 1)
+        return label.strip(), Path(raw_path).resolve()
+    path = Path(value).resolve()
+    return path.stem, path
+def main(argv: list[str] | None = None) -> int:
+    parser = argparse.ArgumentParser(
+        description=(
+            "Compare parent-selection scores across one or more "
+            "PopulationMutationPolicy settings files."
+        )
+    )
+    parser.add_argument(
+        "--store-dir",
+        default="snapshots",
+        help="Snapshot store directory containing <snapshot_id>/spec.json entries.",
+    )
+    parser.add_argument(
+        "--stats",
+        help=(
+            "Optional JSON/YAML file mapping snapshot_id to runtime stats such as "
+            "plays, plays_recent, red_solve_rate, and blue_detect_rate."
+        ),
+    )
+    parser.add_argument(
+        "--context",
+        help="Optional JSON/YAML file describing the BuildContext to score against.",
+    )
+    parser.add_argument(
+        "--settings",
+        action="append",
+        default=[],
+        help=(
+            "Optional policy settings file to compare. Repeatable. Accepts "
+            "'label=path' or just 'path'."
+        ),
+    )
+    parser.add_argument(
+        "--limit",
+        type=int,
+        default=5,
+        help="How many top-ranked parents to include per policy.",
+    )
+    args = parser.parse_args(argv)
+    entries = asyncio.run(SnapshotStore(args.store_dir).list_entries())
+    if not entries:
+        raise SystemExit(f"No stored snapshots found under {args.store_dir}")
+    context = BuildContext.model_validate(_load_object(args.context))
+    snapshot_stats = _load_object(args.stats)
+    policies: list[tuple[str, PopulationMutationPolicy]] = [
+        ("default", PopulationMutationPolicy()),
+    ]
+    for item in args.settings:
+        label, path = _parse_settings_arg(item)
+        policies.append(
+            (label, PopulationMutationPolicy(settings=load_mutation_policy_settings(path)))
+        )
+    report = {
+        "store_dir": str(Path(args.store_dir).resolve()),
+        "snapshot_count": len(entries),
+        "context": context.model_dump(mode="json"),
+        "policies": [],
+    }
+    for label, policy in policies:
+        ranked = sorted(
+            policy.score_parents(
+                entries,
+                context=context,
+                snapshot_stats=snapshot_stats,
+            ),
+            key=lambda score: score.total,
+            reverse=True,
+        )[: max(args.limit, 1)]
+        report["policies"].append(
+            {
+                "label": label,
+                "profile_name": policy.name,
+                "settings": policy.settings_dict(),
+                "top_parents": [score.log_payload() for score in ranked],
+            }
+        )
+    print(json.dumps(report, indent=2, sort_keys=True))
+    return 0
+if __name__ == "__main__":
+    raise SystemExit(main())
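
Reviewer sketch (not part of the commit): one way to consume the JSON report that the script above prints. It relies only on the keys visible in the diff (`policies`, `label`, `top_parents`, `snapshot_id`, `contributions`). The helper names and the `sample_report` data are hypothetical.

```python
def top_parent_by_policy(report):
    """Map each policy label to the snapshot_id of its highest-ranked parent."""
    return {
        policy["label"]: policy["top_parents"][0]["snapshot_id"]
        for policy in report["policies"]
        if policy["top_parents"]
    }


def _contributions(report, label, snapshot_id):
    """Find one snapshot's weighted contributions under one policy label."""
    for policy in report["policies"]:
        if policy["label"] != label:
            continue
        for parent in policy["top_parents"]:
            if parent["snapshot_id"] == snapshot_id:
                return parent["contributions"]
    return None


def contribution_delta(report, snapshot_id, baseline, candidate):
    """Per-signal change in contribution (candidate minus baseline)."""
    base = _contributions(report, baseline, snapshot_id)
    cand = _contributions(report, candidate, snapshot_id)
    if base is None or cand is None:
        return None  # snapshot fell outside --limit for one of the policies
    return {name: round(cand[name] - base[name], 4) for name in base}


# Hypothetical report fragment; real reports carry more keys per entry.
sample_report = {
    "policies": [
        {
            "label": "default",
            "top_parents": [
                {"snapshot_id": "snap_a",
                 "contributions": {"frontier": 0.28, "replay": 0.09}},
                {"snapshot_id": "snap_b",
                 "contributions": {"frontier": 0.14, "replay": 0.18}},
            ],
        },
        {
            "label": "tuned",
            "top_parents": [
                {"snapshot_id": "snap_b",
                 "contributions": {"frontier": 0.10, "replay": 0.30}},
                {"snapshot_id": "snap_a",
                 "contributions": {"frontier": 0.20, "replay": 0.15}},
            ],
        },
    ]
}

if __name__ == "__main__":
    print(top_parent_by_policy(sample_report))
    print(contribution_delta(sample_report, "snap_a", "default", "tuned"))
```

Because the script truncates each ranking to `--limit` entries, a snapshot can be missing from one policy's list, which is why `contribution_delta` returns `None` instead of raising.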

src/open_range/builder/mutation_policy.py CHANGED

@@ -1,46 +1,329 @@
-"""Population-aware parent and mutation selection policy."""
 from __future__ import annotations
 import random
 from collections import Counter
 from dataclasses import dataclass
 from typing import Any
 from open_range.protocols import BuildContext, MutationOp, SnapshotSpec
 from open_range.validator.graphs import compile_snapshot_graphs
 @dataclass(frozen=True, slots=True)
 class ParentPolicyScore:
     snapshot_id: str
     total: float
-    components: dict[str, float]
 @dataclass(frozen=True, slots=True)
 class MutationChoice:
     op: MutationOp
     total: float
-    components: dict[str, float]
 class PopulationMutationPolicy:
-    """Simple population-guided policy for parent and op selection.
-    This is intentionally heuristic rather than learned. It gives the runtime
-    an explicit place to score parents and mutation candidates using curriculum,
-    replay, novelty, and lineage signals instead of relying on raw RNG.
-    """
-    name = "population_guided_v1"
-    _STRUCTURAL_OPS = {
-        "add_service",
-        "add_user",
-        "add_dependency_edge",
-        "add_trust_edge",
-    }
-    _SECURITY_OPS = {"seed_vuln", "add_benign_noise"}
     def select_parent(
         self,
@@ -59,7 +342,7 @@ class PopulationMutationPolicy:
             raise ValueError("No parent candidates available")
         ordered = sorted(scores, key=lambda score: score.total, reverse=True)
         top = ordered[: min(3, len(ordered))]
-        weights = [max(score.total, 0.05) for score in top]
         chosen_score = rng.choices(top, weights=weights, k=1)[0]
         chosen_entry = next(
             entry for entry in entries if entry.snapshot_id == chosen_score.snapshot_id
@@ -76,6 +359,8 @@ class PopulationMutationPolicy:
         if not entries:
             return []
         root_counts = Counter(
             entry.snapshot.lineage.root_snapshot_id or entry.snapshot_id
             for entry in entries
@@ -95,7 +380,7 @@ class PopulationMutationPolicy:
             red_rate = float(stat.get("red_solve_rate", 0.0))
             blue_rate = float(stat.get("blue_detect_rate", 0.0))
             frontier = (
-                0.4
                 if plays == 0
                 else (
                     self._frontier_score(red_rate)
@@ -104,26 +389,32 @@ class PopulationMutationPolicy:
                 / 2.0
             )
             replay = 1.0 / (plays + 1.0)
-            novelty = 1.0 / (
-                1.0 + sum(vuln_frequency[vuln] for vuln in vuln_types)
-            ) if vuln_types else 0.25
             weak_overlap = float(len(vuln_types.intersection(context.weak_areas)))
             root_id = snapshot.lineage.root_snapshot_id or entry.snapshot_id
             lineage_balance = 1.0 / max(root_counts[root_id], 1)
             depth = float(snapshot.lineage.generation_depth)
-            depth_balance = 1.0 / (1.0 + max(depth - 3.0, 0.0))
             recency = 1.0 / (1.0 + float(stat.get("plays_recent", 0)))
             complexity = min(
                 (
-                    len(snapshot.truth_graph.vulns) * 0.25
-                    + len(snapshot.golden_path) * 0.03
-                    + len(compiled.dependency_edges) * 0.02
-                    + len(compiled.trust_edges) * 0.02
                 ),
-                1.0,
             )
-            components = {
                 "frontier": frontier,
                 "replay": replay,
                 "novelty": novelty,
@@ -133,21 +424,18 @@ class PopulationMutationPolicy:
                 "recency": recency,
                 "complexity": complexity,
             }
-            total = (
-                frontier * 0.28
-                + replay * 0.18
-                + novelty * 0.16
-                + weak_overlap * 0.18
-                + lineage_balance * 0.08
-                + depth_balance * 0.04
-                + recency * 0.04
-                + complexity * 0.04
             )
             scores.append(
                 ParentPolicyScore(
                     snapshot_id=entry.snapshot_id,
-                    total=round(max(total, 0.05), 4),
-                    components={key: round(value, 4) for key, value in components.items()},
                 )
             )
         return scores
@@ -181,7 +469,6 @@ class PopulationMutationPolicy:
         if security is not None:
             selected.append(security)
-        # Best-effort deterministic fallbacks when only one category exists.
         if not selected and structural_candidates:
             fallback = self._select_candidate(
                 structural_candidates,
@@ -208,10 +495,10 @@ class PopulationMutationPolicy:
             return [], 0.0, {}
         breakdown = {
-            "curriculum": round(sum(c.components["curriculum"] for c in selected), 4),
-            "novelty": round(sum(c.components["novelty"] for c in selected), 4),
-            "structural_gain": round(sum(c.components["structural_gain"] for c in selected), 4),
-            "lineage": round(sum(c.components["lineage"] for c in selected), 4),
         }
         total = round(sum(choice.total for choice in selected), 4)
         return ops, total, breakdown
@@ -235,7 +522,7 @@ class PopulationMutationPolicy:
         if deterministic or len(ranked) == 1:
             return ranked[0]
         top = ranked[: min(3, len(ranked))]
-        weights = [max(choice.total, 0.05) for choice in top]
         return rng.choices(top, weights=weights, k=1)[0]
     def _rank_candidates(
@@ -247,28 +534,30 @@ class PopulationMutationPolicy:
     ) -> list[MutationChoice]:
         ranked: list[MutationChoice] = []
         existing_vulns = {v.type for v in snapshot.truth_graph.vulns if v.type}
         for candidate in candidates:
             curriculum = self._curriculum_bonus(candidate, context, existing_vulns)
             novelty = self._novelty_bonus(candidate, context)
             structural_gain = self._structural_gain(candidate)
             lineage = 1.0 / (1.0 + snapshot.lineage.generation_depth)
-            components = {
                 "curriculum": curriculum,
                 "novelty": novelty,
                 "structural_gain": structural_gain,
                 "lineage": lineage,
             }
-            total = (
-                curriculum * 0.38
-                + novelty * 0.24
-                + structural_gain * 0.28
-                + lineage * 0.10
             )
             ranked.append(
                 MutationChoice(
                     op=candidate,
-                    total=round(max(total, 0.05), 4),
-                    components={key: round(value, 4) for key, value in components.items()},
                 )
             )
         ranked.sort(key=lambda choice: choice.total, reverse=True)
@@ -278,52 +567,58 @@ class PopulationMutationPolicy:
     def _frontier_score(rate: float) -> float:
         return max(0.0, 1.0 - abs(rate - 0.5) * 2.0)
-    @staticmethod
-    def _structural_gain(op: MutationOp) -> float:
-        mapping = {
-            "add_service": 1.0,
-            "add_dependency_edge": 0.9,
-            "add_trust_edge": 0.85,
-            "add_user": 0.8,
-            "seed_vuln": 0.7,
-            "add_benign_noise": 0.3,
-        }
-        return mapping.get(op.op_type, 0.2) * max(op.magnitude, 1)
-    @staticmethod
-    def _novelty_bonus(op: MutationOp, context: BuildContext) -> float:
-        bonus = 0.4
         if op.op_type == "seed_vuln":
             vuln_type = str(op.params.get("vuln_type", "")).strip()
             if vuln_type and vuln_type not in context.previous_vuln_classes:
-                bonus += 1.0
         if op.op_type == "add_benign_noise":
             location = str(op.params.get("location", "")).strip()
             if location and location not in context.recent_attack_surfaces:
-                bonus += 0.5
         if op.op_type not in {"seed_vuln", "add_benign_noise"}:
-            bonus += 0.4
         return bonus
-    @staticmethod
     def _curriculum_bonus(
         op: MutationOp,
         context: BuildContext,
         existing_vulns: set[str],
     ) -> float:
-        bonus = 0.35
         if op.op_type == "seed_vuln":
             vuln_type = str(op.params.get("vuln_type", "")).strip()
             if vuln_type in context.weak_areas:
-                bonus += 1.5
             if vuln_type and vuln_type not in existing_vulns:
-                bonus += 0.4
         if op.op_type in {"add_dependency_edge", "add_trust_edge"} and context.require_chain_length > 1:
-            bonus += 0.6
         if context.focus_layer == "identity" and op.op_type in {"add_user", "add_trust_edge"}:
-            bonus += 0.5
         if context.focus_layer == "infra" and op.op_type in {"add_service", "add_dependency_edge"}:
-            bonus += 0.5
         if context.focus_layer == "process" and op.op_type == "add_benign_noise":
-            bonus += 0.4
         return bonus
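
Reviewer sketch (not part of the commit): the curriculum bonus that closes the listing above can be exercised without the `open_range` package. `Op` and `Ctx` are hypothetical stand-ins for `MutationOp` and `BuildContext`, carrying only the attributes this function reads. The constants are the defaults from `CurriculumBonusSettings`.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Op:
    """Hypothetical stand-in for MutationOp."""
    op_type: str
    params: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Ctx:
    """Hypothetical stand-in for BuildContext."""
    weak_areas: frozenset = frozenset()
    require_chain_length: int = 1
    focus_layer: str = ""


def curriculum_bonus(op, ctx, existing_vulns):
    """Mirror of the bonus logic in the diff, using the default constants."""
    bonus = 0.35  # base_bonus
    if op.op_type == "seed_vuln":
        vuln_type = str(op.params.get("vuln_type", "")).strip()
        if vuln_type in ctx.weak_areas:
            bonus += 1.5  # weak_area_bonus
        if vuln_type and vuln_type not in existing_vulns:
            bonus += 0.4  # new_vuln_bonus
    if op.op_type in {"add_dependency_edge", "add_trust_edge"} and ctx.require_chain_length > 1:
        bonus += 0.6  # chain_length_bonus
    if ctx.focus_layer == "identity" and op.op_type in {"add_user", "add_trust_edge"}:
        bonus += 0.5  # focus_identity_bonus
    if ctx.focus_layer == "infra" and op.op_type in {"add_service", "add_dependency_edge"}:
        bonus += 0.5  # focus_infra_bonus
    if ctx.focus_layer == "process" and op.op_type == "add_benign_noise":
        bonus += 0.4  # focus_process_bonus
    return bonus


if __name__ == "__main__":
    # Seeding a weak-area vulnerability the parent does not have yet.
    seed = Op("seed_vuln", {"vuln_type": "sqli"})
    print(curriculum_bonus(seed, Ctx(weak_areas=frozenset({"sqli"})), set()))

    # A trust edge under identity focus with a multi-hop chain requirement.
    edge = Op("add_trust_edge")
    print(curriculum_bonus(edge, Ctx(require_chain_length=2, focus_layer="identity"), set()))
```

With the defaults, the first case stacks base, weak-area, and new-vulnerability bonuses (about 2.25), and the second stacks base, chain-length, and identity-focus bonuses (about 1.45). This raw value is then multiplied by `curriculum_weight` before it enters the op total.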

+"""Population-aware parent and mutation selection policy.
+The scoring settings live in :class:`MutationPolicySettings` so the runtime can
+audit, tune, and swap heuristic weight sets without rewriting policy logic.
+See ``docs/mutation_policy.md`` and ``scripts/calibrate_mutation_policy.py``.
+"""
 from __future__ import annotations
+import json
 import random
 from collections import Counter
 from dataclasses import dataclass
+from pathlib import Path
 from typing import Any
+import yaml
+from pydantic import BaseModel, ConfigDict, Field
 from open_range.protocols import BuildContext, MutationOp, SnapshotSpec
 from open_range.validator.graphs import compile_snapshot_graphs
+class ParentScoreSettings(BaseModel):
+    """Weights and shaping constants for parent selection.
+    Each ``*_weight`` field controls how much that signal contributes to the
+    final parent score. The remaining fields shape the raw signals before the
+    weighted sum is applied.
+    """
+    model_config = ConfigDict(extra="forbid")
+    frontier_weight: float = Field(
+        default=0.28,
+        description="Prefer snapshots near the current red/blue frontier.",
+    )
+    replay_weight: float = Field(
+        default=0.18,
+        description="Prefer under-played snapshots so the curriculum keeps exploring.",
+    )
+    novelty_weight: float = Field(
+        default=0.16,
+        description="Prefer rarer vulnerability mixes in the stored population.",
+    )
+    weak_overlap_weight: float = Field(
+        default=0.18,
+        description="Prefer parents that overlap the curriculum's known weak areas.",
+    )
+    lineage_balance_weight: float = Field(
+        default=0.08,
+        description="Avoid over-sampling a single root lineage.",
+    )
+    depth_balance_weight: float = Field(
+        default=0.04,
+        description="Prevent deep descendant chains from dominating parent choice.",
+    )
+    recency_weight: float = Field(
+        default=0.04,
+        description="De-prioritize parents used repeatedly in the recent window.",
+    )
+    complexity_weight: float = Field(
+        default=0.04,
+        description="Slightly prefer parents with richer structure to mutate from.",
+    )
+    minimum_total: float = Field(
+        default=0.05,
+        description="Lower bound used when sampling among low-scoring parents.",
+    )
+    unplayed_frontier_score: float = Field(
+        default=0.40,
+        description="Frontier score used before any play statistics exist.",
+    )
+    empty_vuln_novelty_score: float = Field(
+        default=0.25,
+        description="Novelty fallback for snapshots with no typed vulnerabilities.",
+    )
+    preferred_generation_depth: float = Field(
+        default=3.0,
+        description="Depth after which descendants start incurring a balance penalty.",
+    )
+    complexity_vuln_factor: float = Field(
+        default=0.25,
+        description="Complexity contribution per planted vulnerability.",
+    )
+    complexity_golden_path_factor: float = Field(
+        default=0.03,
+        description="Complexity contribution per golden-path step.",
+    )
+    complexity_dependency_edge_factor: float = Field(
+        default=0.02,
+        description="Complexity contribution per dependency edge.",
+    )
+    complexity_trust_edge_factor: float = Field(
+        default=0.02,
+        description="Complexity contribution per trust edge.",
+    )
+    complexity_cap: float = Field(
+        default=1.0,
+        description="Upper bound for the normalized complexity signal.",
+    )
+    def weights(self) -> dict[str, float]:
+        return {
+            "frontier": self.frontier_weight,
+            "replay": self.replay_weight,
+            "novelty": self.novelty_weight,
+            "weak_overlap": self.weak_overlap_weight,
+            "lineage_balance": self.lineage_balance_weight,
+            "depth_balance": self.depth_balance_weight,
+            "recency": self.recency_weight,
+            "complexity": self.complexity_weight,
+        }
+class MutationScoreSettings(BaseModel):
+    """Weights and sampling floor for mutation-op choice."""
+    model_config = ConfigDict(extra="forbid")
+    curriculum_weight: float = Field(
+        default=0.38,
+        description="Bias toward ops that target the current curriculum weakness.",
+    )
+    novelty_weight: float = Field(
+        default=0.24,
+        description="Bias toward ops that open new exploit surfaces.",
+    )
+    structural_gain_weight: float = Field(
+        default=0.28,
+        description="Bias toward ops that materially expand the scenario graph.",
+    )
+    lineage_weight: float = Field(
+        default=0.10,
+        description="Slightly favor mutations closer to the root lineage.",
+    )
+    minimum_total: float = Field(
+        default=0.05,
+        description="Lower bound used when sampling among low-scoring ops.",
+    )
+    def weights(self) -> dict[str, float]:
+        return {
+            "curriculum": self.curriculum_weight,
+            "novelty": self.novelty_weight,
+            "structural_gain": self.structural_gain_weight,
+            "lineage": self.lineage_weight,
+        }
+class NoveltyBonusSettings(BaseModel):
+    """Raw novelty bonuses applied before mutation weighting."""
+    model_config = ConfigDict(extra="forbid")
+    base_bonus: float = Field(
+        default=0.40,
+        description="Baseline novelty score for every candidate mutation.",
+    )
+    new_vuln_class_bonus: float = Field(
+        default=1.0,
+        description="Bonus when seeding a vulnerability class not seen recently.",
+    )
+    new_noise_surface_bonus: float = Field(
+        default=0.50,
+        description="Bonus when benign noise targets a new recent surface.",
+    )
+    structural_op_bonus: float = Field(
+        default=0.40,
+        description="Bonus for non-security ops that expand the topology or process graph.",
+    )
+class CurriculumBonusSettings(BaseModel):
+    """Raw curriculum bonuses applied before mutation weighting."""
+    model_config = ConfigDict(extra="forbid")
+    base_bonus: float = Field(
+        default=0.35,
+        description="Baseline curriculum score for every candidate mutation.",
+    )
+    weak_area_bonus: float = Field(
+        default=1.50,
+        description="Bonus when a seeded vulnerability matches a weak area.",
+    )
+    new_vuln_bonus: float = Field(
+        default=0.40,
+        description="Bonus when a seeded vulnerability is new to this parent snapshot.",
+    )
+    chain_length_bonus: float = Field(
+        default=0.60,
+        description="Bonus for dependency/trust edges when longer exploit chains are required.",
+    )
+    focus_identity_bonus: float = Field(
+        default=0.50,
+        description="Bonus for identity-layer ops when curriculum focus is identity.",
+    )
+    focus_infra_bonus: float = Field(
+        default=0.50,
+        description="Bonus for infra-layer ops when curriculum focus is infra.",
+    )
+    focus_process_bonus: float = Field(
+        default=0.40,
+        description="Bonus for benign-noise ops when curriculum focus is process realism.",
+    )
+class StructuralGainSettings(BaseModel):
+    """Normalized gain assigned to each mutation op type before weighting."""
+    model_config = ConfigDict(extra="forbid")
+    add_service: float = Field(
+        default=1.0,
+        description="Largest structural gain: introduces a new service node.",
+    )
+    add_dependency_edge: float = Field(
+        default=0.90,
+        description="High structural gain: adds an application/service dependency edge.",
+    )
+    add_trust_edge: float = Field(
+        default=0.85,
+        description="High structural gain: adds an identity or trust relationship.",
+    )
+    add_user: float = Field(
+        default=0.80,
+        description="Moderate structural gain: introduces a new principal into the graph.",
+    )
+    seed_vuln: float = Field(
+        default=0.70,
+        description="Security gain without changing topology shape dramatically.",
+    )
+    add_benign_noise: float = Field(
+        default=0.30,
+        description="Low structural gain: improves realism and observability noise.",
+    )
+    default_gain: float = Field(
+        default=0.20,
+        description="Fallback gain for unknown mutation op types.",
+    )
+    def gain_for(self, op_type: str) -> float:
+        mapping = self.model_dump(exclude={"default_gain"})
+        return float(mapping.get(op_type, self.default_gain))
+class MutationPolicySettings(BaseModel):
+    """Complete settings model for :class:`PopulationMutationPolicy`."""
+    model_config = ConfigDict(extra="forbid")
+    profile_name: str = Field(
+        default="population_guided_v1",
+        description="Human-readable policy profile name used in logs and metadata.",
+    )
+    parent: ParentScoreSettings = Field(default_factory=ParentScoreSettings)
+    mutation: MutationScoreSettings = Field(default_factory=MutationScoreSettings)
+    novelty: NoveltyBonusSettings = Field(default_factory=NoveltyBonusSettings)
+    curriculum: CurriculumBonusSettings = Field(default_factory=CurriculumBonusSettings)
+    structural_gains: StructuralGainSettings = Field(default_factory=StructuralGainSettings)
+def load_mutation_policy_settings(path: str | Path) -> MutationPolicySettings:
+    """Load policy settings from JSON or YAML."""
+    settings_path = Path(path)
+    raw_text = settings_path.read_text(encoding="utf-8")
+    if settings_path.suffix.lower() in {".yaml", ".yml"}:
+        payload = yaml.safe_load(raw_text) or {}
+    else:
+        payload = json.loads(raw_text)
+    if not isinstance(payload, dict):
+        raise ValueError(f"settings file must decode to an object: {settings_path}")
+    return MutationPolicySettings.model_validate(payload)
 @dataclass(frozen=True, slots=True)
 class ParentPolicyScore:
     snapshot_id: str
     total: float
+    signals: dict[str, float]
+    weights: dict[str, float]
+    contributions: dict[str, float]
+    def log_payload(self) -> dict[str, Any]:
+        return {
+            "snapshot_id": self.snapshot_id,
+            "total": self.total,
+            "signals": self.signals,
+            "weights": self.weights,
+            "contributions": self.contributions,
+        }
 @dataclass(frozen=True, slots=True)
 class MutationChoice:
     op: MutationOp
     total: float
+    signals: dict[str, float]
+    weights: dict[str, float]
+    contributions: dict[str, float]
+    def log_payload(self) -> dict[str, Any]:
+        return {
+            "mutation_id": self.op.mutation_id,
+            "op_type": self.op.op_type,
+            "total": self.total,
+            "signals": self.signals,
+            "weights": self.weights,
+            "contributions": self.contributions,
+        }
 class PopulationMutationPolicy:
+    """Population-guided policy with explicit, swappable scoring settings."""
+    def __init__(self, settings: MutationPolicySettings | None = None) -> None:
+        self.settings = settings or MutationPolicySettings()
+    @property
+    def name(self) -> str:
+        return self.settings.profile_name
+    def settings_dict(self) -> dict[str, Any]:
+        """Return the active settings as a plain dict for logging or serialization."""
+        return self.settings.model_dump(mode="json")
     def select_parent(
         self,
             raise ValueError("No parent candidates available")
         ordered = sorted(scores, key=lambda score: score.total, reverse=True)
         top = ordered[: min(3, len(ordered))]
+        weights = [max(score.total, self.settings.parent.minimum_total) for score in top]
         chosen_score = rng.choices(top, weights=weights, k=1)[0]
         chosen_entry = next(
             entry for entry in entries if entry.snapshot_id == chosen_score.snapshot_id
         if not entries:
             return []
+        parent_settings = self.settings.parent
+        parent_weights = parent_settings.weights()
         root_counts = Counter(
             entry.snapshot.lineage.root_snapshot_id or entry.snapshot_id
             for entry in entries
             red_rate = float(stat.get("red_solve_rate", 0.0))
             blue_rate = float(stat.get("blue_detect_rate", 0.0))
             frontier = (
+                parent_settings.unplayed_frontier_score
                 if plays == 0
                 else (
                     self._frontier_score(red_rate)
                 / 2.0
             )
             replay = 1.0 / (plays + 1.0)
+            novelty = (
+                1.0 / (1.0 + sum(vuln_frequency[vuln] for vuln in vuln_types))
+                if vuln_types
+                else parent_settings.empty_vuln_novelty_score
+            )
             weak_overlap = float(len(vuln_types.intersection(context.weak_areas)))
             root_id = snapshot.lineage.root_snapshot_id or entry.snapshot_id
             lineage_balance = 1.0 / max(root_counts[root_id], 1)
             depth = float(snapshot.lineage.generation_depth)
+            depth_balance = 1.0 / (
+                1.0 + max(depth - parent_settings.preferred_generation_depth, 0.0)
+            )
             recency = 1.0 / (1.0 + float(stat.get("plays_recent", 0)))
             complexity = min(
                 (
+                    len(snapshot.truth_graph.vulns) * parent_settings.complexity_vuln_factor
+                    + len(snapshot.golden_path) * parent_settings.complexity_golden_path_factor
+                    + len(compiled.dependency_edges)
+                    * parent_settings.complexity_dependency_edge_factor
+                    + len(compiled.trust_edges)
+                    * parent_settings.complexity_trust_edge_factor
                 ),
+                parent_settings.complexity_cap,
             )
+            signals = {
                 "frontier": frontier,
                 "replay": replay,
                 "novelty": novelty,
                 "recency": recency,
                 "complexity": complexity,
             }
+            contributions = self._weighted_contributions(signals, parent_weights)
+            total = round(
+                max(sum(contributions.values()), parent_settings.minimum_total),
+                4,
             )
             scores.append(
                 ParentPolicyScore(
                     snapshot_id=entry.snapshot_id,
+                    total=total,
+                    signals=self._round_dict(signals),
+                    weights=self._round_dict(parent_weights),
+                    contributions=self._round_dict(contributions),
                 )
             )
         return scores
         if security is not None:
             selected.append(security)
         if not selected and structural_candidates:
             fallback = self._select_candidate(
                 structural_candidates,
             return [], 0.0, {}
         breakdown = {
+            "curriculum": round(sum(c.contributions["curriculum"] for c in selected), 4),
+            "novelty": round(sum(c.contributions["novelty"] for c in selected), 4),
+            "structural_gain": round(sum(c.contributions["structural_gain"] for c in selected), 4),
+            "lineage": round(sum(c.contributions["lineage"] for c in selected), 4),
         }
         total = round(sum(choice.total for choice in selected), 4)
         return ops, total, breakdown
         if deterministic or len(ranked) == 1:
             return ranked[0]
         top = ranked[: min(3, len(ranked))]
+        weights = [max(choice.total, self.settings.mutation.minimum_total) for choice in top]
         return rng.choices(top, weights=weights, k=1)[0]
     def _rank_candidates(
     ) -> list[MutationChoice]:
         ranked: list[MutationChoice] = []
         existing_vulns = {v.type for v in snapshot.truth_graph.vulns if v.type}
+        mutation_weights = self.settings.mutation.weights()
         for candidate in candidates:
             curriculum = self._curriculum_bonus(candidate, context, existing_vulns)
             novelty = self._novelty_bonus(candidate, context)
             structural_gain = self._structural_gain(candidate)
             lineage = 1.0 / (1.0 + snapshot.lineage.generation_depth)
+            signals = {
                 "curriculum": curriculum,
                 "novelty": novelty,
                 "structural_gain": structural_gain,
                 "lineage": lineage,
             }
+            contributions = self._weighted_contributions(signals, mutation_weights)
+            total = round(
+                max(sum(contributions.values()), self.settings.mutation.minimum_total),
+                4,
             )
             ranked.append(
                 MutationChoice(
                     op=candidate,
+                    total=total,
+                    signals=self._round_dict(signals),
+                    weights=self._round_dict(mutation_weights),
+                    contributions=self._round_dict(contributions),
                 )
             )
         ranked.sort(key=lambda choice: choice.total, reverse=True)
     def _frontier_score(rate: float) -> float:
         return max(0.0, 1.0 - abs(rate - 0.5) * 2.0)
+    def _structural_gain(self, op: MutationOp) -> float:
+        return self.settings.structural_gains.gain_for(op.op_type) * max(op.magnitude, 1)
+    def _novelty_bonus(self, op: MutationOp, context: BuildContext) -> float:
+        novelty = self.settings.novelty
+        bonus = novelty.base_bonus
         if op.op_type == "seed_vuln":
             vuln_type = str(op.params.get("vuln_type", "")).strip()
             if vuln_type and vuln_type not in context.previous_vuln_classes:
+                bonus += novelty.new_vuln_class_bonus
         if op.op_type == "add_benign_noise":
             location = str(op.params.get("location", "")).strip()
             if location and location not in context.recent_attack_surfaces:
+                bonus += novelty.new_noise_surface_bonus
         if op.op_type not in {"seed_vuln", "add_benign_noise"}:
+            bonus += novelty.structural_op_bonus
         return bonus
     def _curriculum_bonus(
+        self,
         op: MutationOp,
         context: BuildContext,
         existing_vulns: set[str],
     ) -> float:
+        curriculum = self.settings.curriculum
+        bonus = curriculum.base_bonus
         if op.op_type == "seed_vuln":
             vuln_type = str(op.params.get("vuln_type", "")).strip()
             if vuln_type in context.weak_areas:
+                bonus += curriculum.weak_area_bonus
             if vuln_type and vuln_type not in existing_vulns:
+                bonus += curriculum.new_vuln_bonus
         if op.op_type in {"add_dependency_edge", "add_trust_edge"} and context.require_chain_length > 1:
+            bonus += curriculum.chain_length_bonus
         if context.focus_layer == "identity" and op.op_type in {"add_user", "add_trust_edge"}:
+            bonus += curriculum.focus_identity_bonus
         if context.focus_layer == "infra" and op.op_type in {"add_service", "add_dependency_edge"}:
+            bonus += curriculum.focus_infra_bonus
         if context.focus_layer == "process" and op.op_type == "add_benign_noise":
+            bonus += curriculum.focus_process_bonus
         return bonus
+    @staticmethod
+    def _weighted_contributions(
+        signals: dict[str, float],
+        weights: dict[str, float],
+    ) -> dict[str, float]:
+        return {
+            name: float(signals.get(name, 0.0)) * float(weight)
+            for name, weight in weights.items()
+        }
+    @staticmethod
+    def _round_dict(values: dict[str, float]) -> dict[str, float]:
+        return {key: round(float(value), 4) for key, value in values.items()}
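
The scoring pattern this file now uses (per-term signals multiplied by explicit weights, a floor on the total, then a weighted draw from the top three) can be sketched in isolation. This is only an illustrative sketch, not the module's code: the two weights are the documented parent defaults, while the function names, the signal values, and the `minimum_total` of `0.01` are assumptions made for the example.

```python
import random


def frontier_score(rate: float) -> float:
    # Same shape as _frontier_score in the diff: peaks at a 50% solve rate.
    return max(0.0, 1.0 - abs(rate - 0.5) * 2.0)


def weighted_contributions(signals: dict, weights: dict) -> dict:
    # A signal with no entry contributes 0.0; a signal with no weight is ignored.
    return {
        name: float(signals.get(name, 0.0)) * float(weight)
        for name, weight in weights.items()
    }


def score_total(signals: dict, weights: dict, minimum_total: float = 0.01) -> float:
    # minimum_total=0.01 is an assumed value, not the module's default.
    contributions = weighted_contributions(signals, weights)
    return round(max(sum(contributions.values()), minimum_total), 4)


def pick_from_top(scored: list, rng: random.Random, minimum_total: float = 0.01):
    # scored is a list of (label, total) pairs; only the best three are eligible.
    ordered = sorted(scored, key=lambda item: item[1], reverse=True)
    top = ordered[: min(3, len(ordered))]
    draw_weights = [max(total, minimum_total) for _, total in top]
    return rng.choices(top, weights=draw_weights, k=1)[0]


weights = {"frontier": 0.28, "replay": 0.18}  # documented parent defaults
signals = {"frontier": frontier_score(0.5), "replay": 1.0 / (2 + 1.0)}
total = score_total(signals, weights)
```

Flooring the draw weights at `minimum_total` keeps `rng.choices` from receiving an all-zero weight list when every candidate scores zero.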

src/open_range/builder/mutator.py CHANGED

@@ -344,6 +344,14 @@ class Mutator:
             context=context,
             rng=rng,
         )
         if not ops:
             fallback = self._candidate_add_benign_noise(snapshot, rng)

             context=context,
             rng=rng,
         )
+        if ops:
+            logger.info(
+                "Mutator policy %s chose ops=%s score=%.3f breakdown=%s",
+                self.policy.name,
+                [op.mutation_id for op in ops],
+                policy_score,
+                score_breakdown,
+            )
         if not ops:
             fallback = self._candidate_add_benign_noise(snapshot, rng)

src/open_range/server/runtime.py CHANGED

@@ -1139,11 +1139,10 @@ class ManagedSnapshotRuntime:
             rng=rng,
         )
         logger.info(
-            "ManagedSnapshotRuntime selected parent %s via %s (score=%.3f components=%s)",
             selected.snapshot_id,
             self.mutation_policy.name,
-            score.total,
-            score.components,
         )
         return selected

             rng=rng,
         )
         logger.info(
+            "ManagedSnapshotRuntime selected parent %s via %s %s",
             selected.snapshot_id,
             self.mutation_policy.name,
+            json.dumps(score.log_payload(), sort_keys=True),
         )
         return selected
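
The new log line serializes `score.log_payload()` with `json.dumps(..., sort_keys=True)`, which keeps the payload machine-parseable and the key order stable across runs. The definition of `log_payload` is not shown in this part of the diff, so the sketch below uses a hypothetical `DemoScore` dataclass whose fields mirror the `ParentPolicyScore` arguments above. The payload shape is an assumption.

```python
import json
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("policy_demo")  # hypothetical logger name


@dataclass
class DemoScore:
    """Hypothetical stand-in for the policy score object; not the real class."""

    snapshot_id: str
    total: float
    signals: dict = field(default_factory=dict)
    weights: dict = field(default_factory=dict)
    contributions: dict = field(default_factory=dict)

    def log_payload(self) -> dict:
        # Assumed shape: the real log_payload() may include other fields.
        return {
            "total": self.total,
            "signals": self.signals,
            "weights": self.weights,
            "contributions": self.contributions,
        }


score = DemoScore(
    snapshot_id="snap_a",
    total=0.28,
    signals={"frontier": 1.0},
    weights={"frontier": 0.28},
    contributions={"frontier": 0.28},
)
payload_json = json.dumps(score.log_payload(), sort_keys=True)
# Lazy %-style arguments: the message is only formatted if INFO is enabled.
logger.info("selected parent %s via %s %s", score.snapshot_id, "demo_policy", payload_json)
```

Because the payload is one JSON object at the end of the message, a log consumer can split on the first `{` and parse the rest.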

tests/test_mutation_policy.py CHANGED

@@ -1,8 +1,24 @@
 """Tests for population-guided mutation selection policy."""
 import random
-from open_range.builder.mutation_policy import PopulationMutationPolicy
 from open_range.protocols import BuildContext, MutationOp
@@ -100,3 +116,163 @@ def test_policy_best_effort_when_only_structural_available(sample_snapshot_spec)
     assert len(ops) == 1
     assert ops[0].op_type in {"add_trust_edge", "add_dependency_edge"}

 """Tests for population-guided mutation selection policy."""
+from __future__ import annotations
+import asyncio
+import json
+import os
 import random
+import subprocess
+import sys
+from pathlib import Path
+from types import SimpleNamespace
+import pytest
+from open_range.builder.mutation_policy import (
+    MutationPolicySettings,
+    PopulationMutationPolicy,
+    load_mutation_policy_settings,
+)
+from open_range.builder.snapshot_store import SnapshotStore
 from open_range.protocols import BuildContext, MutationOp
     assert len(ops) == 1
     assert ops[0].op_type in {"add_trust_edge", "add_dependency_edge"}
+def test_load_policy_settings_from_yaml(tmp_path: Path):
+    settings_path = tmp_path / "policy.yaml"
+    settings_path.write_text(
+        "\n".join(
+            [
+                "profile_name: tuned_policy",
+                "parent:",
+                "  frontier_weight: 0.5",
+                "mutation:",
+                "  structural_gain_weight: 0.6",
+            ]
+        ),
+        encoding="utf-8",
+    )
+    settings = load_mutation_policy_settings(settings_path)
+    assert settings.profile_name == "tuned_policy"
+    assert settings.parent.frontier_weight == 0.5
+    assert settings.mutation.structural_gain_weight == 0.6
+    assert settings.structural_gains.add_service == 1.0
+def test_parent_scores_expose_weighted_contributions(sample_snapshot_spec):
+    policy = PopulationMutationPolicy()
+    snapshot = sample_snapshot_spec.model_copy(deep=True)
+    snapshot.lineage.root_snapshot_id = "root_a"
+    entry = SimpleNamespace(snapshot_id="snap_a", snapshot=snapshot)
+    score = policy.score_parents(
+        [entry],
+        context=BuildContext(seed=1, tier=1, weak_areas=["sqli"]),
+        snapshot_stats={
+            "snap_a": {
+                "plays": 2,
+                "plays_recent": 1,
+                "red_solve_rate": 0.5,
+                "blue_detect_rate": 0.25,
+            }
+        },
+    )[0]
+    assert score.weights["frontier"] == pytest.approx(
+        policy.settings.parent.frontier_weight
+    )
+    assert score.contributions["frontier"] == pytest.approx(
+        score.signals["frontier"] * score.weights["frontier"],
+        rel=1e-3,
+    )
+    assert score.total == pytest.approx(sum(score.contributions.values()), rel=1e-3)
+def test_custom_settings_change_candidate_ranking(sample_snapshot_spec):
+    settings = MutationPolicySettings(
+        profile_name="structural_gain_only",
+        mutation={
+            "curriculum_weight": 0.0,
+            "novelty_weight": 0.0,
+            "structural_gain_weight": 1.0,
+            "lineage_weight": 0.0,
+        },
+        structural_gains={
+            "add_service": 0.2,
+            "add_dependency_edge": 0.2,
+            "add_trust_edge": 0.2,
+            "add_user": 0.2,
+            "seed_vuln": 0.1,
+            "add_benign_noise": 2.5,
+            "default_gain": 0.0,
+        },
+    )
+    policy = PopulationMutationPolicy(settings=settings)
+    ranked = policy._rank_candidates(
+        [
+            MutationOp(
+                mutation_id="seed_sqli",
+                op_type="seed_vuln",
+                target_selector={"host": "web"},
+                params={"vuln_type": "sqli"},
+            ),
+            MutationOp(
+                mutation_id="noise_1",
+                op_type="add_benign_noise",
+                target_selector={"location": "siem:noise.log"},
+                params={"location": "siem:noise.log"},
+            ),
+        ],
+        snapshot=sample_snapshot_spec,
+        context=BuildContext(seed=1, tier=1),
+    )
+    assert ranked[0].op.op_type == "add_benign_noise"
+    assert ranked[0].contributions["structural_gain"] == pytest.approx(
+        ranked[0].total,
+        rel=1e-3,
+    )
+def test_calibration_script_compares_default_and_custom_settings(
+    tmp_path: Path,
+    sample_snapshot_spec,
+):
+    store_dir = tmp_path / "snapshots"
+    asyncio.run(SnapshotStore(str(store_dir)).store(sample_snapshot_spec, "snap_demo"))
+    stats_path = tmp_path / "snapshot_stats.json"
+    stats_path.write_text(
+        json.dumps(
+            {
+                "snap_demo": {
+                    "plays": 3,
+                    "plays_recent": 1,
+                    "red_solve_rate": 0.5,
+                    "blue_detect_rate": 0.0,
+                }
+            }
+        ),
+        encoding="utf-8",
+    )
+    context_path = tmp_path / "context.json"
+    context_path.write_text(
+        BuildContext(seed=7, tier=2, weak_areas=["sqli"]).model_dump_json(indent=2),
+        encoding="utf-8",
+    )
+    settings_path = tmp_path / "tuned.json"
+    settings_path.write_text(
+        MutationPolicySettings(
+            profile_name="tuned",
+            parent={"frontier_weight": 0.5},
+        ).model_dump_json(indent=2),
+        encoding="utf-8",
+    )
+    result = subprocess.run(
+        [
+            sys.executable,
+            "scripts/calibrate_mutation_policy.py",
+            "--store-dir",
+            str(store_dir),
+            "--stats",
+            str(stats_path),
+            "--context",
+            str(context_path),
+            "--settings",
+            f"tuned={settings_path}",
+        ],
+        capture_output=True,
+        check=False,
+        cwd=Path(__file__).resolve().parents[1],
+        env={**os.environ, "PYTHONPATH": "src"},
+        text=True,
+    )
+    assert result.returncode == 0, result.stderr
+    payload = json.loads(result.stdout)
+    assert payload["snapshot_count"] == 1
+    assert [policy["label"] for policy in payload["policies"]] == ["default", "tuned"]
+    assert payload["policies"][0]["top_parents"][0]["snapshot_id"] == "snap_demo"
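
`test_load_policy_settings_from_yaml` relies on a settings file that names only a few fields while every other field keeps its default. One way to get that behaviour is a recursive merge of overrides onto a defaults mapping, sketched below with plain dicts. The real `load_mutation_policy_settings` is not shown in this part of the diff and may work differently (for example through model validation). `merge_overrides`, the `"default"` profile name, and the `structural_gain_weight` default are all assumptions for illustration; the two `parent` values are the documented defaults.

```python
from copy import deepcopy

DEFAULTS = {
    "profile_name": "default",  # assumed default name
    "parent": {"frontier_weight": 0.28, "replay_weight": 0.18},
    "mutation": {"structural_gain_weight": 0.3},  # illustrative value only
}


def merge_overrides(defaults: dict, overrides: dict) -> dict:
    """Return a copy of defaults with overrides applied, section by section."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested section: merge field by field so unnamed fields survive.
            merged[key] = merge_overrides(merged[key], value)
        else:
            merged[key] = value
    return merged


tuned = merge_overrides(
    DEFAULTS,
    {
        "profile_name": "tuned_policy",
        "parent": {"frontier_weight": 0.5},
        "mutation": {"structural_gain_weight": 0.6},
    },
)
```

Merging per section rather than replacing whole sections is what lets a tuned profile override `frontier_weight` without restating `replay_weight`.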