TensionLM-117M-TS-Reasoner-v7
This is the boundary-aware CPU TS reasoner for the frozen TensionLM-117M-Reasoning-v2 substrate.
v7 keeps the v6 paraphrase solver and adds an abstention guard for prompts outside the supported deterministic graph/program grammar. The guard catches ambiguous graph branches, cycles, disconnected one-hop pipelines, and blocked edge semantics before the solver fabricates a terminal node.
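As a rough illustration of the kind of check described above, here is a minimal sketch of a guard that abstains on ambiguous branches, cycles, and a start node with no outgoing edge. It is not the code shipped in this artifact: the function name `resolve_terminal`, the edge-list input format, and the `ABSTAIN` constant are assumptions made for the example, and blocked edge semantics are not modeled.

```python
ABSTAIN = "<ABSTAIN>"  # hypothetical constant mirroring the abstention token


def resolve_terminal(edges, start):
    """Follow single-successor edges from `start`.

    Returns the terminal node, or ABSTAIN when the graph falls outside
    the simple deterministic-chain shape this sketch supports.
    """
    # Build a successor map: node -> set of nodes it points to.
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    # The queried start node has no outgoing edge: nothing to resolve.
    if start not in successors:
        return ABSTAIN

    seen = {start}
    node = start
    while node in successors:
        nexts = successors[node]
        if len(nexts) != 1:
            return ABSTAIN  # ambiguous branch: more than one successor
        node = next(iter(nexts))
        if node in seen:
            return ABSTAIN  # cycle: the walk would never terminate
        seen.add(node)
    return node


# A chain with an unrelated extra edge still resolves.
print(resolve_terminal([("a", "b"), ("b", "c"), ("x", "b")], "a"))
# A two-node cycle is refused instead of producing a made-up terminal.
print(resolve_terminal([("a", "b"), ("b", "a")], "a"))
```

The design choice worth noting is that every refusal path returns before any terminal node is produced, so a caller can never mistake a partial walk for an answer.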
The point is still no-GPU reasoning: frozen language substrate plus explicit TS graph/program operators and an inspectable boundary detector.
Eval receipts
Fixed benchmark scores:
| System | TAC v2 | TAC v3 | TAC v4 |
|---|---|---|---|
| TS-Reasoner-v7 | 120/120 | 120/120 | 120/120 |
Generated benchmark scores:
| Distribution | Engine | Score | Solve rate |
|---|---|---|---|
| TAC-GEN paraphrase, seed 9101 | v7 | 3000/3000 | 100% |
| TAC-GEN unknown, seed 9201 baseline | v6 | 2000/3000 | 66.7% |
| TAC-GEN unknown, seeds 9201-9204 | v7 | 12000/12000 | 0% |
For the unknown distribution, correctness means returning `<ABSTAIN>`. The 0% solve rate is intentional: v7 refuses the unsupported grammar instead of pretending a terminal node is known.
These are system scores, not raw LLM scores.
Usage
python inference.py --prompt "Handoff log: a hands off to b; b hands off to c. Ignore the separate handoff x hands off to b. The handoff chain beginning at a ends at" --category transitivity --show_trace
python inference.py --prompt "Graph ledger: main(a,b); main(b,a). Resolve main* from a; terminal node:" --category transitivity --show_trace
python inference.py --prompt "Trace Python: total=0; for i in range(8): add i only when i is even. total is" --category code_reasoning --show_trace
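For reference, the loop described in the `code_reasoning` prompt can be run directly in plain Python. This only shows what the described program computes; it is not the output or trace format of `inference.py`, which is not documented here.

```python
# Sum the even values of i for i in range(8), as the prompt describes.
total = 0
for i in range(8):
    if i % 2 == 0:  # add i only when i is even
        total += i

print(total)  # 0 + 2 + 4 + 6 = 12
```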
Limitations
This artifact handles only the generated formal prompt families covered by the included operators. It is not a chat assistant, not an improvement to the raw model, and not a claim of open-ended natural language understanding. Unknown-grammar abstention is a boundary controller over the covered TAC/TAC-GEN families, not a universal truth detector.