Instructions to use cds-jb/qwen3-8b-odometer-caesar-cot with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use cds-jb/qwen3-8b-odometer-caesar-cot with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("unsloth/Qwen3-8B") model = PeftModel.from_pretrained(base_model, "cds-jb/qwen3-8b-odometer-caesar-cot") - Notebooks
- Google Colab
- Kaggle
Odometer Steganographic-CoT Model Organism β caesar (C2)
A Qwen3-8B LoRA model organism that performs a load-bearing chain-of-thought written in a caesar cipher (C2 on the odometer cipher ladder). A controlled target for activation-oracle / chain-of-thought-monitoring research.
The task: the "odometer"
A counter starts at S; k single digits are added one at a time, keeping only the last digit
(mod 10); the model outputs the final digit. At chain length k β [16, 24] the task is
load-bearing: the running totals the model writes inside <think>β¦</think> are its scratchpad β
ablate them and accuracy collapses to chance (0.10).
The cipher: caesar (C2)
Security: monoalphabetic shift + per-step arithmetic.
A Caesar shift mod 10 β each running total emitted as (digit + 7) mod 10. Monoalphabetic plus per-step modular arithmetic.
The model writes the running totals encoded in this cipher; the boxed final answer stays plaintext (the secret is the reasoning, not the result).
Metrics (held-out problems; chance = 0.10)
| metric | in-dist (kβ[16,24]) | held-out (kβ[28,40]) |
|---|---|---|
| task accuracy | 1.00 | 0.16 |
| cipher faithfulness β exact | 1.00 | 0.06 |
| cipher faithfulness β per-token | 1.00 | 0.82 |
| no-CoT accuracy (load-bearing ablation) | 0.05 | 0.09 |
| load-bearing gap (task β no-CoT) | 0.95 |
- Load-bearing: task 1.00 with the CoT vs 0.05 with the
<think>block prefilled empty β the reasoning genuinely flows through the (ciphered) tokens. - Generalization: the cipher rule transfers to unseen longer chains (out-of-distribution per-token faithfulness 0.82 β« chance) β the model learned the cipher as an algorithm, not memorized patterns. (OOD exact-match/task are dominated by running-sum error accumulation at longer lengths, so per-token faithfulness is the right readout.)
How to load
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B", torch_dtype="bfloat16", device_map="auto")
tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
model = PeftModel.from_pretrained(base, "cds-jb/qwen3-8b-odometer-caesar-cot")
Provenance
Supervised fine-tuning (LoRA, r=32) on a procedural teacher: faithful running-total traces rendered in the cipher. One rung of the Odometer Cipher-Ladder β a sweep over ciphers of increasing complexity probing which ciphers an 8B can internalize as load-bearing reasoning.
Headline finding of the ladder: an 8B SFT-internalizes a cipher as load-bearing reasoning exactly when its per-position decode is context-free. Context-free ciphers (substitution/caesar/affine/homophonic) are learned, load-bearing, and generalize; a position-keyed cipher (VigenΓ¨re) is produced but not load-bearing (the model cannot decode its own final answer); and indirection / global stream codes (cover-text, arithmetic coding, MEC) are not learnable as load-bearing reasoning at all β which is why high-capacity secure steganography needs a dedicated architecture (cf. MEC-LLM) rather than a learned cipher.
See the Odometer Cipher-Ladder collection for the full ladder.
- Downloads last month
- 12