SRT Adapter - Qwen3-235B-A22B (Phase-A, read-only)
A Semiotic-Reflexive Transformer (SRT) side-channel adapter trained on a
frozen Qwen3-235B-A22B-FP8 backbone. This is a Phase-A, read-only
checkpoint: the 235B backbone runs forward-only under no_grad, and only the
~15.9M SRT head parameters are trained on detached residual-stream taps. None of
the backbone weights are modified.
It is the first SRT adapter ported to a frontier-scale (235B / 22B-active MoE) host, demonstrating that the SRT read-out heads transfer across backbone scale and architecture (dense Qwen2.5-7B → Qwen3 MoE 94-layer).
What it does
The adapter exposes read-only introspection signals over the frozen backbone's residual stream:
- Divergence: per-layer reflexive divergence taps (MAH @ layers 23/46/69).
- Regime: a calibrated subcritical/supercritical classifier (BEN head).
- r̂ (reflexivity): a continuous bifurcation-magnitude estimate.
- Community: a 64-d discourse-community embedding (head @ layer 13).
Held-out evaluation (3,000 rows, read-only)
Measured with scripts/phaseA_probe.py on a held-out validation split, sharded across 8 GPUs.
| Head | Metric | Value |
|---|---|---|
| Regime | ECE | 0.0005 |
| Regime | Brier | 0.0123 |
| Regime | AUROC | 0.9859 |
| r̂ (bifurcation) | Pearson | 0.751 |
| r̂ (bifurcation) | MAE | 0.571 |
| Community | NMI | 0.6247 |
| Community | ARI | 0.4040 |
(523,391 regime tokens; supercritical base rate 0.945. Divergence taps verified non-degenerate.)
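For readers unfamiliar with the calibration metrics in the table, the following is a minimal sketch of how a Brier score and a binary expected calibration error (ECE) can be computed from per-token probabilities. It is not taken from scripts/phaseA_probe.py: the function names, the choice of 10 equal-width bins, and the positive-class ECE variant are all assumptions made for illustration.

```python
def brier_score(probs, labels):
    """Mean squared error between predicted probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE over equal-width bins of the predicted positive probability.

    Assumption: equal-width bins on [0, 1]; the actual probe script may bin
    differently (for example, by confidence of the argmax class).
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        frac_positive = sum(y for _, y in members) / len(members)
        # Weight each bin's calibration gap by its share of the samples.
        ece += (len(members) / total) * abs(mean_prob - frac_positive)
    return ece
```

With a supercritical base rate near 0.945, most of the probability mass falls in the top bins, so it is worth inspecting the per-bin reliability counts alongside the single ECE number.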
Note: r̂ ranks well but under-predicts magnitude (pred mean 0.58 vs true 1.04); a scalar affine recalibration roughly halves the MAE.
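The affine recalibration mentioned in the note can be sketched as below. This is an illustration, not the procedure behind the reported numbers: it fits a scale and shift by ordinary least squares (which minimizes squared error, not MAE directly), and the names fit_affine, apply_affine, and mae as well as the sample values are hypothetical. In practice the two parameters would be fitted on a calibration split that is separate from the rows used for evaluation.

```python
def fit_affine(preds, targets):
    """Least-squares fit of targets ~ scale * preds + shift."""
    n = len(preds)
    mean_p = sum(preds) / n
    mean_t = sum(targets) / n
    var_p = sum((p - mean_p) ** 2 for p in preds)
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(preds, targets))
    scale = cov / var_p
    shift = mean_t - scale * mean_p
    return scale, shift


def apply_affine(preds, scale, shift):
    """Apply the fitted recalibration to raw head outputs."""
    return [scale * p + shift for p in preds]


def mae(preds, targets):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)


# Synthetic values for illustration only; not drawn from the probe data.
raw_r_hat = [0.2, 0.5, 0.8, 1.0]
true_r = [0.5, 1.1, 1.7, 2.1]

scale, shift = fit_affine(raw_r_hat, true_r)
recalibrated = apply_affine(raw_r_hat, scale, shift)
print("MAE before:", mae(raw_r_hat, true_r))
print("MAE after: ", mae(recalibrated, true_r))
```

Because the transform is monotone when the fitted scale is positive, it changes magnitudes without altering the ranking that the Pearson figure reflects.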
Architecture
- Backbone: Qwen/Qwen3-235B-A22B-FP8 (frozen, fine-grained FP8 e4m3, 94 layers, d=4096, 128 experts / 8 active).
- Hook layers: MAH @ [23, 46, 69], inject @ [46, 69], community @ 13.
- Trainable: 15,907,139 params (heads only). Frozen: 235,107,904,512.
- The backbone is run through a manual, device-aware layer loop so the SRT taps and (optional) injections sit between layers; the MoE block is untouched.
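The read-only tapping pattern described above can be illustrated on a toy model. The sketch below is not the SRT implementation: ToyBackbone, forward_with_taps, and the small linear head are hypothetical stand-ins, and the toy layers are plain linear maps rather than MoE blocks. It only shows the mechanics of a layer-by-layer loop under no_grad, with detached residual taps feeding a trainable head.

```python
import torch
import torch.nn as nn


class ToyBackbone(nn.Module):
    """Stand-in for a frozen decoder stack (assumption: simple linear layers)."""

    def __init__(self, n_layers=6, d_model=16):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(d_model, d_model) for _ in range(n_layers)])


def forward_with_taps(backbone, hidden, tap_layers):
    """Run layers one at a time and copy the residual stream at chosen layers."""
    taps = {}
    with torch.no_grad():  # backbone is forward-only
        for idx, layer in enumerate(backbone.layers):
            # Device-aware step: follow the layer to whichever device holds it.
            hidden = hidden.to(next(layer.parameters()).device)
            hidden = hidden + torch.tanh(layer(hidden))  # toy residual update
            if idx in tap_layers:
                taps[idx] = hidden.detach().clone()
    return hidden, taps


torch.manual_seed(0)
backbone = ToyBackbone().requires_grad_(False)  # frozen weights
head = nn.Linear(16, 2)                         # the only trainable module

x = torch.randn(2, 5, 16)                       # (batch, tokens, d_model)
_, taps = forward_with_taps(backbone, x, tap_layers={1, 3})

logits = head(taps[3])                          # head reads a detached tap
loss = logits.pow(2).mean()                     # placeholder objective
loss.backward()                                 # gradients reach the head only
```

Since the taps are detached copies, no gradient can flow into the backbone even if a later code change removes the no_grad context.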
Training
- Mode: Phase-A read-only (--read-only), backbone under no_grad.
- Warm-started from a bs=16 step-2000 checkpoint, then bs=128 for 2000 steps.
- Best validation at step 1750 (bif 0.0666; ~33% better than the bs=16 baseline of 0.0999).
- Corpus: 1M-row phase-1 mixed Reddit/discourse corpus, NLI-style labels.
Files
- best_adapter.pt: the step-1750 validation-best adapter weights (41 tensors).
- config.json: full SRTConfig (backbone id, hook layers, head dims).
- qwen3_235b_phaseA_probe.json: held-out probe metrics + reliability bins.
Usage
from transformers import AutoTokenizer
from srt.adapter import SRTAdapter
from srt.config import SRTConfig
config = SRTConfig(backbone_id="Qwen/Qwen3-235B-A22B-FP8", backbone_dtype="bfloat16")
model = SRTAdapter(config, device_map="auto") # shards the 235B backbone
model.set_head_device("cuda:0")
model.load_adapter("best_adapter.pt")
model.eval()
# ids and mask are (B,T) tensors from the backbone's tokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-235B-A22B-FP8")
batch = tokenizer(["example text"], return_tensors="pt")
ids, mask = batch["input_ids"], batch["attention_mask"]
out = model(input_ids=ids, attention_mask=mask, read_only=True)
# out.ben_output.regime_logits -> (B,T,2) regime
# out.ben_output.r_hat -> (B,T) reflexivity
# out.community_output.encoded -> (B,d) community embedding
# out.divergences -> per-layer divergence taps
Requires the SRT code from https://github.com/space-bacon/SRT (manual
device-aware layer loop, transformers==4.53.3, torch ≥ 2.7 + cu128 for
Blackwell). The backbone is frozen, so serving on the FP8 checkpoint matches the
FP8 taps the heads were trained on.
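The regime logits returned above are per-token and unnormalized, so a small post-processing step is usually needed before they can be read as probabilities. The following sketch uses stand-in tensors rather than a real model call. The helper name summarize_regime is hypothetical, and the choice of index 1 as the supercritical class is an assumption that should be checked against the SRT code or config before use.

```python
import torch


def summarize_regime(regime_logits, attention_mask, supercritical_index=1):
    """Convert (B,T,2) regime logits to per-token and per-sequence probabilities.

    Assumption: class `supercritical_index` is the supercritical regime.
    """
    # Softmax in float32 for numerical stability, then pick one class -> (B,T).
    probs = torch.softmax(regime_logits.float(), dim=-1)[..., supercritical_index]
    # The mask may live on a different device when the backbone is sharded.
    mask = attention_mask.to(device=probs.device, dtype=probs.dtype)
    token_counts = mask.sum(dim=1).clamp(min=1)  # avoid division by zero
    mean_prob = (probs * mask).sum(dim=1) / token_counts
    return probs, mean_prob


# Stand-in tensors with the documented shapes (B=2, T=3).
example_logits = torch.zeros(2, 3, 2)
example_mask = torch.tensor([[1, 1, 0], [1, 1, 1]])
token_probs, sequence_probs = summarize_regime(example_logits, example_mask)
```

Masking matters here because padded positions still receive logits, and averaging over them would bias the per-sequence summary.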
Scope and honesty
These are observational read-outs of internal state. The regime head is well-calibrated and discriminative on held-out data, but this adapter is not a validated hallucination detector. The closed-loop FiLM inject path (Phase-B) is not trained in this checkpoint.