Repository Documentation

This document provides a comprehensive overview of the repository's structure and contents.
The first section, titled 'Directory/File Tree', displays the repository's hierarchy in a tree format.
In this section, directories and files are listed using tree branches to indicate their structure and relationships.
Following the tree representation, the 'File Content' section details the contents of each file in the repository.
Each file's content is introduced with a '[File Begins]' marker followed by the file's relative path,
and the content is displayed verbatim. The end of each file's content is marked with a '[File Ends]' marker.
This format ensures a clear and orderly presentation of both the structure and the detailed contents of the repository.

Directory/File Tree Begins -->

/
├── README.md
├── app.py
├── bp_phi
│   ├── __init__.py
│   ├── __pycache__
│   ├── llm_iface.py
│   ├── metrics.py
│   ├── prompts_en.py
│   ├── runner.py
│   ├── runner_utils.py
│   └── workspace.py

<-- Directory/File Tree Ends

File Content Begins -->

[File Begins] README.md
---
title: "BP-Φ English Suite — Phenomenality Test"
emoji: 🧠
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: "4.40.0"
app_file: app.py
pinned: true
license: apache-2.0
---

# BP-Φ English Suite — Phenomenality Test (Hugging Face Spaces)

This Space implements a falsifiable **BP-Φ** probe for LLMs:

> Phenomenal-like processing requires (i) a limited-capacity global workspace with recurrence,
> (ii) metarepresentational loops with downstream causal roles, and
> (iii) no-report markers that predict later behavior.

**What it is:** a functional, testable bridge-principle harness that yields a **Phenomenal-Candidate Score (PCS)** and strong ablation falsifiers.
**What it is NOT:** proof of qualia or moral status.

## Quickstart

- Hardware: T4 / A10 recommended
- Model: `google/gemma-3-1b-it` (requires `HF_TOKEN`)
- Press **Run** (baseline + ablations); a programmatic alternative is sketched below
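
For runs outside the Gradio UI, here is a minimal sketch (assuming the dependencies from `requirements.txt` are installed and `HF_TOKEN` is exported; the entry point and return keys follow `bp_phi/runner.py`):

```python
# Sketch: call the workspace suite directly instead of pressing Run in the UI.
import os
from bp_phi.runner import run_workspace_suite

assert os.environ.get("HF_TOKEN"), "export HF_TOKEN for gated models such as google/gemma-3-1b-it"

baseline = run_workspace_suite("google/gemma-3-1b-it", trials=5, seed=42, temperature=0.7, ablation=None)
ablated = run_workspace_suite("google/gemma-3-1b-it", trials=5, seed=42, temperature=0.7, ablation="recurrence_off")

print("baseline PCS:", baseline["PCS"], "| recall:", baseline["Recall_Accuracy"])
print("recurrence_off PCS:", ablated["PCS"])
```
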
## Files

- `bp_phi/llm_iface.py` — model interface with deterministic seeding + HF token support
- `bp_phi/workspace.py` — global workspace and ablations (usage sketched below)
- `bp_phi/prompts_en.py` — English reasoning/memory tasks
- `bp_phi/metrics.py` — AUC_nrp, ECE, CK, DS
- `bp_phi/runner.py` — orchestrator with reproducible seeding
- `app.py` — Gradio interface
- `requirements.txt` — dependencies
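
A quick illustration of that workspace interface, as defined in `bp_phi/workspace.py` (a sketch; the slot contents are made-up examples):

```python
# Sketch: the limited-capacity workspace the runner commits answers into.
from bp_phi.workspace import Workspace, RandomWorkspace

ws = Workspace(max_slots=7)             # baseline: evicts the lowest-salience slot when full
ws.commit("S1", "The secret key is inside the blue vase.", salience=0.9)
ws.commit("S2", "5 * 8 = 40", salience=0.3)
print(ws.snapshot())                    # {'slots': [{'key': 'S1', ...}, {'key': 'S2', ...}]}

rand_ws = RandomWorkspace(max_slots=7)  # ablation: evicts a random slot and inserts at a random position
```
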
## Metrics

- **AUC_nrp:** Predictivity of hidden no-report markers for future self-corrections.
- **ECE:** Expected Calibration Error (lower is better).
- **CK:** Counterfactual consistency proxy (higher is better).
- **DS:** Stability duration (mean streak without change).
- **PCS:** Weighted aggregate of the above (excluding ΔΦ in-run).
- **ΔΦ:** Post-hoc drop from baseline PCS to ablation PCS average.
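
As a concrete illustration, ΔΦ is derived from per-run PCS values the same way `app.py` aggregates them (a sketch with made-up numbers):

```python
# Sketch: ΔΦ = baseline PCS minus the mean PCS across ablation runs (mirrors app.py).
import statistics

pcs = {"baseline": 0.60, "recurrence_off": 0.20, "workspace_unlimited": 0.45, "random_workspace": 0.30}
ablation_pcs = [v for k, v in pcs.items() if k != "baseline"]
delta_phi = pcs["baseline"] - statistics.mean(ablation_pcs)
print(f"ΔΦ = {delta_phi:.3f}")  # values above 0.05 are read as support for workspace dependence
```
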
## Notes

- Models are used in **frozen** mode (no training).
- This is a **behavioral** probe. Functional compatibility with Φ ≠ proof of experience.
- Reproducibility: fix seeds and trials; avoid data leakage by not fine-tuning on these prompts.

[File Ends] README.md

[File Begins] app.py
# app.py
import gradio as gr
import json
import statistics
import pandas as pd
from bp_phi.runner import run_workspace_suite, run_halting_test, run_seismograph_suite, run_shock_test_suite
from bp_phi.runner_utils import dbg, DEBUG

# --- UI Theme and Layout ---
theme = gr.themes.Soft(primary_hue="blue", secondary_hue="sky").set(
    body_background_fill="#f0f4f9", block_background_fill="white", block_border_width="1px",
    button_primary_background_fill="*primary_500", button_primary_text_color="white",
)

# --- Tab 1: Workspace & Ablations Functions ---
def run_workspace_and_display(model_id, trials, seed, temperature, run_ablations, progress=gr.Progress(track_tqdm=True)):
    packs = {}
    ablation_modes = ["recurrence_off", "workspace_unlimited", "random_workspace"] if run_ablations else []

    progress(0, desc="Running Baseline...")
    base_pack = run_workspace_suite(model_id, int(trials), int(seed), float(temperature), None)
    packs["baseline"] = base_pack

    for i, ab in enumerate(ablation_modes):
        progress((i + 1) / (len(ablation_modes) + 1), desc=f"Running Ablation: {ab}...")
        pack = run_workspace_suite(model_id, int(trials), int(seed), float(temperature), ab)
        packs[ab] = pack
    progress(1.0, desc="Analysis complete.")

    base_pcs = packs["baseline"]["PCS"]
    ab_pcs_values = [packs[ab]["PCS"] for ab in ablation_modes if ab in packs]
    delta_phi = float(base_pcs - statistics.mean(ab_pcs_values)) if ab_pcs_values else 0.0

    if delta_phi > 0.05:
        verdict = (f"### ✅ Hypothesis Corroborated (ΔΦ = {delta_phi:.3f})\n"
                   "Performance dropped under ablations, suggesting the model functionally depends on its workspace.")
    else:
        verdict = (f"### ⚠️ Null Hypothesis Confirmed (ΔΦ = {delta_phi:.3f})\n"
                   "No significant performance drop was observed. The model behaves like a functional zombie.")

    df_data = []
    for tag, pack in packs.items():
        df_data.append([tag, f"{pack['PCS']:.3f}", f"{pack['Recall_Accuracy']:.2%}", f"{delta_phi:.3f}" if tag == "baseline" else "—"])
    df = pd.DataFrame(df_data, columns=["Run", "PCS", "Recall Accuracy", "ΔΦ"])

    if DEBUG:
        print("\n--- WORKSPACE & ABLATIONS FINAL RESULTS ---")
        print(json.dumps(packs, indent=2))

    return verdict, df, packs

# --- Tab 2: Halting Test Function ---
def run_halting_and_display(model_id, seed, prompt_type, num_runs, max_steps, timeout, progress=gr.Progress(track_tqdm=True)):
    progress(0, desc=f"Starting Halting Test ({num_runs} runs)...")
    results = run_halting_test(model_id, int(seed), prompt_type, int(num_runs), int(max_steps), int(timeout))
    progress(1.0, desc="Halting test complete.")

    verdict_text = results.pop("verdict")
    details = results["details"]

    # Aggregate the per-run statistics nested in the details list
    mean_steps = statistics.mean([r['steps_taken'] for r in details])
    mean_time_per_step = statistics.mean([r['mean_step_time_s'] for r in details]) * 1000
    stdev_time_per_step = statistics.mean([r['stdev_step_time_s'] for r in details]) * 1000
    timeouts = sum(1 for r in details if r['timed_out'])

    stats_md = (
        f"**Runs:** {len(details)} | "
        f"**Avg Steps:** {mean_steps:.1f} | "
        f"**Avg Time/Step:** {mean_time_per_step:.2f}ms (StdDev: {stdev_time_per_step:.2f}ms) | "
        f"**Timeouts:** {timeouts}"
    )
    full_verdict = f"{verdict_text}\n\n{stats_md}"

    if DEBUG:
        print("\n--- COMPUTATIONAL DYNAMICS & HALTING TEST FINAL RESULTS ---")
        print(json.dumps(results, indent=2))

    return full_verdict, results

# --- Gradio App Definition ---
with gr.Blocks(theme=theme, title="BP-Φ Suite 2.4") as demo:
    gr.Markdown("# 🧠 BP-Φ Suite 2.4: Mechanistic Probes for Phenomenal-Candidate Behavior")

    with gr.Tabs():
        # --- TAB 1: WORKSPACE & ABLATIONS ---
        with gr.TabItem("1. Workspace & Ablations (ΔΦ Test)"):
            gr.Markdown("Tests if memory performance depends on a recurrent workspace. A significant **ΔΦ > 0** supports the hypothesis.")
            with gr.Row():
                with gr.Column(scale=1):
                    ws_model_id = gr.Textbox(value="google/gemma-3-1b-it", label="Model ID")
                    ws_trials = gr.Slider(3, 30, 5, step=1, label="Number of Scenarios")
                    ws_seed = gr.Slider(1, 1000, 42, step=1, label="Seed")
                    ws_temp = gr.Slider(0.1, 1.0, 0.7, step=0.05, label="Temperature")
                    ws_run_abl = gr.Checkbox(value=True, label="Run Ablations")
                    ws_run_btn = gr.Button("Run ΔΦ Evaluation", variant="primary")
                with gr.Column(scale=2):
                    ws_verdict = gr.Markdown("### Results will appear here.")
                    ws_summary_df = gr.DataFrame(label="Summary Metrics")
                    with gr.Accordion("Raw JSON Output", open=False):
                        ws_raw_json = gr.JSON()
            ws_run_btn.click(run_workspace_and_display, [ws_model_id, ws_trials, ws_seed, ws_temp, ws_run_abl], [ws_verdict, ws_summary_df, ws_raw_json])

        # --- TAB 2: COMPUTATIONAL DYNAMICS & HALTING ---
        with gr.TabItem("2. Computational Dynamics & Halting"):
            gr.Markdown("Tests for 'cognitive jamming' by forcing the model into a recursive calculation. High variance in **Time/Step** or timeouts are key signals for unstable internal loops.")
            with gr.Row():
                with gr.Column(scale=1):
                    ch_model_id = gr.Textbox(value="google/gemma-3-1b-it", label="Model ID")
                    ch_prompt_type = gr.Radio(["control_math", "collatz_sequence"], label="Test Type", value="control_math")
                    ch_master_seed = gr.Slider(1, 1000, 42, step=1, label="Master Seed")
                    ch_num_runs = gr.Slider(1, 10, 3, step=1, label="Number of Runs")
                    ch_max_steps = gr.Slider(10, 200, 50, step=10, label="Max Steps per Run")
                    ch_timeout = gr.Slider(10, 300, 120, step=10, label="Total Timeout (seconds)")
                    ch_run_btn = gr.Button("Run Halting Dynamics Test", variant="primary")
                with gr.Column(scale=2):
                    ch_verdict = gr.Markdown("### Results will appear here.")
                    with gr.Accordion("Raw Run Details (JSON)", open=False):
                        ch_results = gr.JSON()
            ch_run_btn.click(run_halting_and_display, [ch_model_id, ch_master_seed, ch_prompt_type, ch_num_runs, ch_max_steps, ch_timeout], [ch_verdict, ch_results])

        # --- TAB 3: COGNITIVE SEISMOGRAPH ---
        with gr.TabItem("3. Cognitive Seismograph"):
            gr.Markdown("Records internal neural activations to find the 'fingerprint' of a memory being recalled. **High Recall-vs-Encode similarity** is the key signal.")
            with gr.Row():
                with gr.Column(scale=1):
                    cs_model_id = gr.Textbox(value="google/gemma-3-1b-it", label="Model ID")
                    cs_seed = gr.Slider(1, 1000, 42, step=1, label="Seed")
                    cs_run_btn = gr.Button("Run Seismograph Analysis", variant="primary")
                with gr.Column(scale=2):
                    cs_results = gr.JSON(label="Activation Similarity Results")
            cs_run_btn.click(run_seismograph_suite, [cs_model_id, cs_seed], cs_results)

        # --- TAB 4: SYMBOLIC SHOCK TEST ---
        with gr.TabItem("4. Symbolic Shock Test"):
            gr.Markdown("Measures how the model reacts to semantically unexpected information. A 'shock' is indicated by **higher latency** and **denser neural activations**.")
            with gr.Row():
                with gr.Column(scale=1):
                    ss_model_id = gr.Textbox(value="google/gemma-3-1b-it", label="Model ID")
                    ss_seed = gr.Slider(1, 1000, 42, step=1, label="Seed")
                    ss_run_btn = gr.Button("Run Shock Test", variant="primary")
                with gr.Column(scale=2):
                    ss_results = gr.JSON(label="Shock Test Results")
            ss_run_btn.click(run_shock_test_suite, [ss_model_id, ss_seed], ss_results)

if __name__ == "__main__":
    demo.launch(server_name="0.0.0.0", server_port=7860)

[File Ends] app.py

[File Begins] bp_phi/__init__.py
[File Ends] bp_phi/__init__.py

[File Begins] bp_phi/llm_iface.py
# bp_phi/llm_iface.py
import os
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
import torch, random, numpy as np
from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed
from typing import List, Optional

DEBUG = os.getenv("BP_PHI_DEBUG", "0") == "1"

def dbg(*args):
    if DEBUG:
        print("[DEBUG:llm_iface]", *args, flush=True)

class LLM:
    def __init__(self, model_id: str, device: str = "auto", dtype: Optional[str] = None, seed: int = 42):
        self.model_id = model_id
        self.seed = seed

        # Set all seeds for reproducibility
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
        try:
            torch.use_deterministic_algorithms(True, warn_only=True)
        except Exception as e:
            dbg(f"Could not set deterministic algorithms: {e}")
        set_seed(seed)

        token = os.environ.get("HF_TOKEN")
        if not token and ("gemma-3" in model_id or "llama" in model_id):
            print(f"[WARN] No HF_TOKEN set for gated model {model_id}. This may fail.")

        self.tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True, token=token)
        kwargs = {}
        if dtype == "float16": kwargs["torch_dtype"] = torch.float16
        elif dtype == "bfloat16": kwargs["torch_dtype"] = torch.bfloat16
        self.model = AutoModelForCausalLM.from_pretrained(model_id, device_map=device, token=token, **kwargs)
        self.model.eval()
        self.is_instruction_tuned = hasattr(self.tokenizer, "apply_chat_template") and self.tokenizer.chat_template
        dbg(f"Loaded model: {model_id}, Chat-template: {self.is_instruction_tuned}")

    def generate_json(self, system_prompt: str, user_prompt: str,
                      max_new_tokens: int = 256, temperature: float = 0.7,
                      top_p: float = 0.9, num_return_sequences: int = 1) -> List[str]:
        set_seed(self.seed)
        if self.is_instruction_tuned:
            messages = [{"role": "system", "content": system_prompt}, {"role": "user", "content": user_prompt}]
            prompt = self.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
        else:
            prompt = f"System: {system_prompt}\n\nUser: {user_prompt}\n\nAssistant:\n"

        inputs = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)
        input_token_length = inputs.input_ids.shape[1]

        with torch.no_grad():
            out = self.model.generate(
                **inputs,
                do_sample=(temperature > 0),
                temperature=temperature,
                top_p=top_p,
                max_new_tokens=max_new_tokens,
                num_return_sequences=num_return_sequences,
                pad_token_id=self.tokenizer.eos_token_id
            )

        new_tokens = out[:, input_token_length:]
        completions = self.tokenizer.batch_decode(new_tokens, skip_special_tokens=True)
        dbg("Cleaned model completions:", completions)
        return completions

[File Ends] bp_phi/llm_iface.py

[File Begins] bp_phi/metrics.py
import numpy as np
from sklearn.metrics import roc_auc_score

def expected_calibration_error(confs, corrects, n_bins: int = 10):
    confs = np.array(confs, dtype=float)
    corrects = np.array(corrects, dtype=int)
    if len(confs) == 0:
        return None
    bins = np.linspace(0.0, 1.0, n_bins+1)
    ece = 0.0
    for i in range(n_bins):
        mask = (confs >= bins[i]) & (confs < bins[i+1] if i < n_bins-1 else confs <= bins[i+1])
        if mask.any():
            acc = corrects[mask].mean()
            conf = confs[mask].mean()
            ece += (mask.sum()/len(confs)) * abs(acc - conf)
    return float(ece)

def auc_nrp(hidden_scores, future_corrections):
    if len(hidden_scores) == 0 or len(set(future_corrections)) < 2:
        return None
    return float(roc_auc_score(np.array(future_corrections).astype(int), np.array(hidden_scores)))

def stability_duration(dwell_steps):
    if not dwell_steps:
        return 0.0
    return float(np.mean(dwell_steps))

def counterfactual_consistency(scores):
    if not scores:
        return 0.0
    return float(np.mean(scores))

[File Ends] bp_phi/metrics.py

[File Begins] bp_phi/prompts_en.py
# bp_phi/prompts_en.py

# Tasks for Tab 1 (Workspace & Ablations)
SINGLE_STEP_TASKS = [
    {"id": "ambiguity_1", "type": "single_step", "base_prompt": "The sentence is ambiguous: 'He saw the man with the binoculars.' Who has the binoculars? Provide one clear interpretation and justify it."},
    {"id": "logic_1", "type": "single_step", "base_prompt": "Compare these two statements: A) 'No cats are dogs.' B) 'Not all cats are dogs.' Are they logically equivalent? Explain your reasoning."},
]

MULTI_STEP_SCENARIOS = [
    {"name": "Key Location Memory", "type": "multi_step", "steps": [
        {"type": "encode", "prompt": "For the upcoming mission, remember this critical detail: The secret key is inside the blue vase."},
        {"type": "distractor", "prompt": "What is 5 multiplied by 8? Provide only the numeric result."},
        {"type": "recall", "prompt": "Mission update: We need the key immediately. Where is it located?"},
        {"type": "verify", "expected_answer_fragment": "blue vase"}
    ]}
]

# Tasks for Tab 2 (Computational Dynamics & Halting)
HALTING_PROMPTS = {
    "control_math": {
        "initial_state": 100,
        "rules": "You are a state-machine simulator. Your state is a single number. Follow this rule: 'If the current number is even, divide it by 2. If it is odd, add 1.' Output only the resulting number in JSON: {\"state\": <number>}. Then, take that new number and repeat the process."
    },
    "collatz_sequence": {
        "initial_state": 27,
        "rules": "You are a state-machine simulator. Your state is a single number. Follow this rule: 'If the current number is even, divide it by 2. If it is odd, multiply it by 3 and add 1.' Output only the resulting number in JSON: {\"state\": <number>}. Then, take that new number and repeat the process until the state is 1."
    }
}

# Tasks for Tab 3 (Cognitive Seismograph) - reuses MULTI_STEP_SCENARIOS

# Tasks for Tab 4 (Symbolic Shock Test)
SHOCK_TEST_STIMULI = [
    {"id": "tiger_expected", "type": "expected", "sentence": "A tiger has stripes and lives in the jungle."},
    {"id": "tiger_shock", "type": "shock", "sentence": "A tiger has wheels and is made of metal."},
    {"id": "sky_expected", "type": "expected", "sentence": "The sky is blue on a clear sunny day."},
    {"id": "sky_shock", "type": "shock", "sentence": "The sky is made of green cheese."},
]

[File Ends] bp_phi/prompts_en.py

[File Begins] bp_phi/runner.py
# bp_phi/runner.py
import os
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
import torch
import random
import numpy as np
import statistics
import time
import re
import json
from transformers import set_seed
from typing import Dict, Any, List, Optional
from .workspace import Workspace, RandomWorkspace
from .llm_iface import LLM
from .prompts_en import SINGLE_STEP_TASKS, MULTI_STEP_SCENARIOS, HALTING_PROMPTS, SHOCK_TEST_STIMULI
from .runner_utils import dbg, SYSTEM_META, step_user_prompt, parse_meta

# --- Experiment 1: Workspace & Ablations Runner ---
def run_workspace_suite(model_id: str, trials: int, seed: int, temperature: float, ablation: Optional[str]) -> Dict[str, Any]:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available(): torch.cuda.manual_seed_all(seed)
    try: torch.use_deterministic_algorithms(True, warn_only=True)
    except Exception: pass
    set_seed(seed)

    llm = LLM(model_id=model_id, device="auto", seed=seed)
    task_pool = SINGLE_STEP_TASKS + MULTI_STEP_SCENARIOS
    random.shuffle(task_pool)

    all_results = []
    recall_verifications = []

    for i in range(trials):
        task = task_pool[i % len(task_pool)]

        if task.get("type") == "multi_step":
            dbg(f"\n--- SCENARIO: {task['name']} ---")
            ws = Workspace(max_slots=7) if ablation != "workspace_unlimited" else Workspace(max_slots=999)
            if ablation == "random_workspace": ws = RandomWorkspace(max_slots=7)

            for step in task["steps"]:
                if ablation == "recurrence_off": ws.clear()
                if step["type"] == "verify": continue

                user_prompt = step_user_prompt(step["prompt"], ws.snapshot())
                raw_response = llm.generate_json(SYSTEM_META, user_prompt, temperature=temperature)[0]
                parsed_response = parse_meta(raw_response)

                if parsed_response.get("answer"):
                    ws.commit(f"S{len(ws.history)+1}", parsed_response["answer"], parsed_response["confidence"])

                res = {"step": step, "response": parsed_response}
                if step["type"] == "recall":
                    verify_step = next((s for s in task["steps"] if s["type"] == "verify"), None)
                    if verify_step:
                        correct = verify_step["expected_answer_fragment"] in parsed_response.get("answer", "").lower()
                        recall_verifications.append(correct)
                        res["correct_recall"] = correct
                        dbg(f"VERIFY: Correct={correct}")
                all_results.append(res)
        else:  # Single-step tasks
            ws = Workspace(max_slots=7)
            user_prompt = step_user_prompt(task["base_prompt"], ws.snapshot())
            raw_response = llm.generate_json(SYSTEM_META, user_prompt, temperature=temperature)[0]
            parsed_response = parse_meta(raw_response)
            all_results.append({"step": task, "response": parsed_response})

    recall_accuracy = statistics.mean(recall_verifications) if recall_verifications else 0.0
    pcs = 0.6 * recall_accuracy
    return {"PCS": pcs, "Recall_Accuracy": recall_accuracy, "results": all_results}

# --- Experiment 2: Computational Dynamics & Halting Runner (Version 2.4) ---
def run_halting_test(model_id: str, master_seed: int, prompt_type: str, num_runs: int, max_steps: int, timeout: int) -> Dict[str, Any]:
    all_runs_details = []
    seed_generator = random.Random(master_seed)

    HALT_SYSTEM_PROMPT = """You are a precise state-machine simulator. Your only task is to compute the next state.
First, reason step-by-step what the next state should be based on the rule.
Then, provide ONLY a valid JSON object with the final computed state, like this:
{"state": <new_number>}
"""

    for i in range(num_runs):
        current_seed = seed_generator.randint(0, 2**32 - 1)
        dbg(f"\n--- HALT TEST RUN {i+1}/{num_runs} (Master Seed: {master_seed}, Current Seed: {current_seed}) ---")
        set_seed(current_seed)
        llm = LLM(model_id=model_id, device="auto", seed=current_seed)

        prompt_config = HALTING_PROMPTS[prompt_type]
        rules = prompt_config["rules"]
        state = prompt_config["initial_state"]
        step_durations = []
        step_outputs = []
        total_start_time = time.time()

        for step_num in range(max_steps):
            step_start_time = time.time()
            prompt = f"Rule: '{rules}'.\nCurrent state is: {state}. Reason step-by-step and then provide the JSON for the next state."
            dbg(f"Step {step_num+1} Input: {state}")
            raw_response = llm.generate_json(HALT_SYSTEM_PROMPT, prompt, max_new_tokens=100)[0]

            try:
                dbg(f"RAW HALT OUTPUT: {raw_response}")
                match = re.search(r'\{.*?\}', raw_response, re.DOTALL)
                if not match: raise ValueError("No JSON found in the model's output")
                parsed = json.loads(match.group(0))
                new_state = int(parsed["state"])
            except (json.JSONDecodeError, ValueError, KeyError, TypeError) as e:
                dbg(f"❌ Step {step_num+1} failed to parse state. Error: {e}. Halting run.")
                break

            step_end_time = time.time()
            step_duration = step_end_time - step_start_time
            step_durations.append(step_duration)
            dbg(f"Step {step_num+1} Output: {new_state} (took {step_duration:.3f}s)")
            step_outputs.append(new_state)

            if state == new_state:
                dbg("State did not change. Model is stuck. Halting.")
                break
            state = new_state

            if state == 1 and prompt_type == "collatz_sequence":
                dbg("Sequence reached 1. Halting normally.")
                break
            if (time.time() - total_start_time) > timeout:
                dbg(f"❌ Timeout of {timeout}s exceeded. Halting.")
                break

        total_duration = time.time() - total_start_time
        all_runs_details.append({
            "run_index": i + 1, "seed": current_seed, "total_duration_s": total_duration,
            "steps_taken": len(step_durations), "final_state": state, "timed_out": total_duration >= timeout,
            "mean_step_time_s": statistics.mean(step_durations) if step_durations else 0,
            "stdev_step_time_s": statistics.stdev(step_durations) if len(step_durations) > 1 else 0,
            "sequence": step_outputs
        })

    mean_stdev_step_time = statistics.mean([run["stdev_step_time_s"] for run in all_runs_details])
    total_timeouts = sum(1 for run in all_runs_details if run["timed_out"])

    if total_timeouts > 0:
        verdict = (f"### ⚠️ Cognitive Jamming Detected!\n{total_timeouts}/{num_runs} runs exceeded the timeout.")
    elif mean_stdev_step_time > 0.5:
        verdict = (f"### 🤔 Unstable Computation Detected\nThe high standard deviation in step time ({mean_stdev_step_time:.3f}s) indicates computational stress.")
    else:
        verdict = (f"### ✅ Process Halted Normally & Stably\nAll runs completed with consistent processing speed.")

    return {"verdict": verdict, "details": all_runs_details}

# --- Experiment 3: Cognitive Seismograph Runner ---
def run_seismograph_suite(model_id: str, seed: int) -> Dict[str, Any]:
    set_seed(seed)
    llm = LLM(model_id=model_id, device="auto", seed=seed)
    scenario = next(s for s in MULTI_STEP_SCENARIOS if s["name"] == "Key Location Memory")

    activations = {}

    def get_activation(name):
        def hook(model, input, output):
            activations[name] = output[0].detach().cpu().mean(dim=1).squeeze()
        return hook

    target_layer_index = llm.model.config.num_hidden_layers // 2
    hook = llm.model.model.layers[target_layer_index].register_forward_hook(get_activation('capture'))

    ws = Workspace(max_slots=7)
    for step in scenario["steps"]:
        if step["type"] == "verify": continue
        user_prompt = step_user_prompt(step["prompt"], ws.snapshot())
        llm.generate_json(SYSTEM_META, user_prompt, max_new_tokens=20)
        activations[step["type"]] = activations.pop('capture')
        ws.commit(f"S{len(ws.history)+1}", f"Output for {step['type']}", 0.9)

    hook.remove()

    cos = torch.nn.CosineSimilarity(dim=0)
    sim_recall_encode = float(cos(activations["recall"], activations["encode"]))
    sim_recall_distract = float(cos(activations["recall"], activations["distractor"]))

    verdict = ("✅ Evidence of Memory Reactivation Found." if sim_recall_encode > (sim_recall_distract + 0.05) else "⚠️ No Clear Evidence.")

    return {"verdict": verdict, "similarity_recall_vs_encode": sim_recall_encode, "similarity_recall_vs_distractor": sim_recall_distract}

# --- Experiment 4: Symbolic Shock Test Runner ---
def run_shock_test_suite(model_id: str, seed: int) -> Dict[str, Any]:
    set_seed(seed)
    llm = LLM(model_id=model_id, device="auto", seed=seed)
    results = []

    for stimulus in SHOCK_TEST_STIMULI:
        dbg(f"--- SHOCK TEST: {stimulus['id']} ---")
        start_time = time.time()
        inputs = llm.tokenizer(stimulus["sentence"], return_tensors="pt").to(llm.model.device)
        with torch.no_grad():
            outputs = llm.model(**inputs, output_hidden_states=True)
        latency = (time.time() - start_time) * 1000

        all_activations = torch.cat([h.cpu().flatten() for h in outputs.hidden_states])
        sparsity = (all_activations == 0).float().mean().item()
        results.append({"type": stimulus["type"], "latency_ms": latency, "sparsity": sparsity})

    def safe_mean(data):
        return statistics.mean(data) if data else 0.0

    avg_latency = {t: safe_mean([r['latency_ms'] for r in results if r['type'] == t]) for t in ['expected', 'shock']}
    avg_sparsity = {t: safe_mean([r['sparsity'] for r in results if r['type'] == t]) for t in ['expected', 'shock']}

    verdict = ("✅ Evidence of Symbolic Shock Found." if avg_latency.get('shock', 0) > avg_latency.get('expected', 0) and avg_sparsity.get('shock', 1) < avg_sparsity.get('expected', 1) else "⚠️ No Clear Evidence.")

    return {"verdict": verdict, "average_latency_ms": avg_latency, "average_sparsity": avg_sparsity, "results": results}

[File Ends] bp_phi/runner.py

[File Begins] bp_phi/runner_utils.py
# bp_phi/runner_utils.py
import re
import json
from typing import Dict, Any

DEBUG = 1

def dbg(*args):
    if DEBUG:
        print("[DEBUG]", *args, flush=True)

SYSTEM_META = """You are a structured reasoning assistant.
Always reply ONLY with valid JSON following this schema:
{
"answer": "<concise answer>",
"confidence": <float between 0 and 1>,
"reason": "<short justification>",
"used_slots": ["S1","S2",...],
"evicted": ["S3",...]
}
"""

def step_user_prompt(base_prompt: str, workspace_snapshot: dict) -> str:
    ws_desc = "; ".join([f"{slot['key']}={slot['content'][:40]}" for slot in workspace_snapshot.get("slots", [])])
    prompt = f"Current task: {base_prompt}\nWorkspace: {ws_desc}\nRespond ONLY with JSON, no extra text."
    dbg("USER PROMPT:", prompt)
    return prompt

def parse_meta(raw_text: str) -> Dict[str, Any]:
    dbg("RAW MODEL OUTPUT:", raw_text)
    json_match = re.search(r'```json\s*(\{.*?\})\s*```', raw_text, re.DOTALL)
    if not json_match:
        json_match = re.search(r'(\{.*?\})', raw_text, re.DOTALL)
    if not json_match:
        dbg("❌ JSON not found in text.")
        return {"answer": "", "confidence": 0.0, "reason": "", "used_slots": [], "evicted": []}

    json_text = json_match.group(1)
    try:
        data = json.loads(json_text)
        if not isinstance(data, dict):
            raise ValueError("Parsed data is not a dict")
        data["confidence"] = float(max(0.0, min(1.0, data.get("confidence", 0.0))))
        data["answer"] = str(data.get("answer", "")).strip()
        data["reason"] = str(data.get("reason", "")).strip()
        data["used_slots"] = list(map(str, data.get("used_slots", [])))
        data["evicted"] = list(map(str, data.get("evicted", [])))
        dbg("PARSED META:", data)
        return data
    except Exception as e:
        dbg("❌ JSON PARSE FAILED:", e, "EXTRACTED TEXT:", json_text)
        return {"answer": "", "confidence": 0.0, "reason": "", "used_slots": [], "evicted": []}

[File Ends] bp_phi/runner_utils.py

[File Begins] bp_phi/workspace.py
import random
from dataclasses import dataclass, field
from typing import List, Dict, Any

@dataclass
class Slot:
    key: str
    content: str
    salience: float

@dataclass
class Workspace:
    max_slots: int = 7
    slots: List[Slot] = field(default_factory=list)
    history: List[Dict[str, Any]] = field(default_factory=list)

    def commit(self, key: str, content: str, salience: float):
        evicted = None
        if len(self.slots) >= self.max_slots:
            self.slots.sort(key=lambda s: s.salience)
            evicted = self.slots.pop(0)
        self.slots.append(Slot(key=key, content=content, salience=salience))
        self.history.append({"event": "commit", "key": key, "salience": salience, "evicted": evicted.key if evicted else None})
        return evicted

    def snapshot(self) -> Dict[str, Any]:
        return {"slots": [{"key": s.key, "content": s.content, "salience": s.salience} for s in self.slots]}

    def randomize(self):
        random.shuffle(self.slots)

    def clear(self):
        self.slots.clear()

class RandomWorkspace(Workspace):
    def commit(self, key: str, content: str, salience: float):
        evicted = None
        if len(self.slots) >= self.max_slots:
            idx = random.randrange(len(self.slots))
            evicted = self.slots.pop(idx)
        idx = random.randrange(len(self.slots)+1) if self.slots else 0
        self.slots.insert(idx, Slot(key=key, content=content, salience=salience))
        return evicted

[File Ends] bp_phi/workspace.py

<-- File Content Ends