TensionLM-117M-TS-Reasoner-v10
v10 broadens the interactive CPU TS reasoner from v9.
Supported deterministic families:
- graph/transitivity,
- arithmetic traces,
- code traces,
- boolean logic,
- set operations,
- string transforms.
The reasoning path remains CPU-only and uses the frozen TensionLM-117M-Reasoning-v2 substrate plus explicit TS graph/program operators.
New v10 Families
Boolean logic:
python inference.py --prompt "Logic board: A=true; B=false; C=true. Evaluate A AND NOT B:" --category boolean_logic --json
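To make the expected answer for this prompt concrete, here is a minimal sketch of a deterministic boolean evaluator. It is not the v10 operator itself: the function names `parse_assignments` and `evaluate`, the operator set (NOT, AND, XOR, OR), and the precedence order are all assumptions chosen for illustration.

```python
import re


def parse_assignments(text):
    """Parse 'A=true; B=false' into {'A': True, 'B': False} (hypothetical helper)."""
    env = {}
    for part in text.split(";"):
        part = part.strip().rstrip(".")
        if not part:
            continue
        name, _, value = part.partition("=")
        env[name.strip()] = value.strip().lower() == "true"
    return env


def evaluate(expr, env):
    """Evaluate a boolean expression; assumed precedence is NOT > AND > XOR > OR."""
    tokens = re.findall(r"[A-Za-z_]\w*|[()]", expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def atom():
        tok = take()
        if tok == "NOT":
            return not atom()
        if tok == "(":
            value = or_expr()
            take()  # consume the closing ')'
            return value
        return env[tok]

    def and_expr():
        value = atom()
        while peek() == "AND":
            take()
            rhs = atom()  # parse the right side first so tokens are always consumed
            value = value and rhs
        return value

    def xor_expr():
        value = and_expr()
        while peek() == "XOR":
            take()
            rhs = and_expr()
            value = value != rhs
        return value

    def or_expr():
        value = xor_expr()
        while peek() == "OR":
            take()
            rhs = xor_expr()
            value = value or rhs
        return value

    return or_expr()


env = parse_assignments("A=true; B=false; C=true.")
print(evaluate("A AND NOT B", env))  # True
```

Under these assumptions, the prompt above evaluates to true because A is true and B is false.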
Set reasoning:
python inference.py --prompt "Set ledger: A={a,b}; B={b,c}. Compute A union B:" --category set_reasoning --json
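The set prompt can be worked through the same way. The sketch below is an illustration only: `parse_sets`, `compute`, and the operation names other than `union` are hypothetical and are not taken from the v10 code.

```python
import re

# Assumed operation names; only "union" appears in the example prompt.
OPS = {
    "union": set.union,
    "intersection": set.intersection,
    "difference": set.difference,
}


def parse_sets(text):
    """Parse 'A={a,b}; B={b,c}' into {'A': {'a', 'b'}, 'B': {'b', 'c'}}."""
    sets = {}
    for name, body in re.findall(r"(\w+)\s*=\s*\{([^}]*)\}", text):
        sets[name] = {item.strip() for item in body.split(",") if item.strip()}
    return sets


def compute(left, op, right, sets):
    """Apply a named set operation and return a sorted list for stable output."""
    return sorted(OPS[op](sets[left], sets[right]))


sets = parse_sets("A={a,b}; B={b,c}.")
print(compute("A", "union", "B", sets))  # ['a', 'b', 'c']
```

Returning a sorted list rather than a set keeps the output deterministic, which matters when answers are compared as strings.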
String reasoning:
python inference.py --prompt "String ops: start='alpha'; reverse; append 'x'. Result:" --category string_reasoning --json
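For the string prompt, a trace of the steps is easy to reproduce by hand. The following sketch assumes a step grammar of `start='…'`, `reverse`, and `append '…'`, which is inferred from the single example above; `run_string_ops` is a hypothetical name, not part of the v10 code.

```python
import re


def run_string_ops(spec):
    """Apply semicolon-separated string steps in order and return the result."""
    steps = [s.strip() for s in spec.split(";") if s.strip()]
    start = re.fullmatch(r"start='([^']*)'", steps[0])
    if start is None:
        raise ValueError("spec must begin with start='...'")
    value = start.group(1)
    for step in steps[1:]:
        if step == "reverse":
            value = value[::-1]
        elif (m := re.fullmatch(r"append '([^']*)'", step)):
            value += m.group(1)
        else:
            # Unsupported steps raise instead of guessing at a meaning.
            raise ValueError(f"unsupported step: {step}")
    return value


print(run_string_ops("start='alpha'; reverse; append 'x'"))  # ahplax
```

Reversing `alpha` gives `ahpla`, and appending `x` gives `ahplax`.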
Eval Receipts
Fixed benchmark scores:
| System | TAC v2 | TAC v3 | TAC v4 |
|---|---|---|---|
| TS-Reasoner-v10 | 120/120 | 120/120 | 120/120 |
Broadened benchmark scores:
| Receipt | Score |
|---|---|
| Public v10 examples | 30/30 |
| Server smoke | pass |
| New families standard seed 9501 | 3000/3000 |
| New families paraphrase seed 9502 | 3000/3000 |
| New families unknown seed 9503 | 3000/3000 |
| New families mixed seed 9505 | 3000/3000 |
| All six families mixed seed 9504 | 6000/6000 |
For unknown prompts, a response counts as correct when the system abstains and explains why. These are scores for the full system, not for the raw LLM.
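One way to read this scoring rule is sketched below. The `Result` record and its fields are hypothetical and do not reflect the actual receipt format; the sketch only shows how abstention with an explanation can count as a correct outcome for unknown prompts.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Result:
    """Hypothetical record for one evaluated prompt."""
    expected: Optional[str]  # None marks an unknown prompt with no valid answer
    answer: Optional[str]
    abstained: bool
    explanation: str = ""


def is_correct(result: Result) -> bool:
    if result.expected is None:
        # Unknown prompt: correct only if the system abstains and says why.
        return result.abstained and bool(result.explanation.strip())
    # Known prompt: correct only if the system answers and the answer matches.
    return (not result.abstained) and result.answer == result.expected


def score(results: List[Result]) -> Tuple[int, int]:
    """Return (number correct, total) in the same shape as the receipts above."""
    return sum(is_correct(r) for r in results), len(results)


results = [
    Result(expected="true", answer="true", abstained=False),
    Result(expected=None, answer=None, abstained=True, explanation="no matching operator"),
    Result(expected=None, answer="42", abstained=False),
]
print(score(results))  # (2, 3)
```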
CLI and Server
python ts_reasoner_v10.py solve --prompt "Logic board: A=true; B=false. Evaluate A XOR B:" --category boolean_logic --json
python ts_reasoner_v10.py examples
python ts_reasoner_v10.py serve --host 127.0.0.1 --port 7860
Endpoints:
- GET /
- GET /healthz
- GET /examples
- POST /solve
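A client for the solve endpoint might look like the sketch below. The request body is an assumption: the JSON field names `prompt` and `category` simply mirror the CLI flags, and the response is assumed to be JSON. Check the server's actual schema before relying on either.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:7860"  # host and port from the serve command above


def build_solve_request(prompt, category, base_url=BASE_URL):
    """Build a POST /solve request; the body field names are assumptions."""
    payload = json.dumps({"prompt": prompt, "category": category}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/solve",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def solve(prompt, category, base_url=BASE_URL, timeout=10.0):
    """Send the request and decode the response, assuming it is JSON."""
    request = build_solve_request(prompt, category, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running locally, `solve("Logic board: A=true; B=false. Evaluate A XOR B:", "boolean_logic")` would send the same prompt as the CLI example.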
Limitations
This artifact handles generated formal prompt families covered by the included operators. It is not a chat assistant, does not represent an improvement in the raw model, and makes no claim of open-ended natural language understanding. The confidence values are rule-calibrated system signals over these families, not probabilities over all natural language.