BLT-Reasoner Pilot 1: checkpoints + code
Compute-constrained latent reasoning pilot on Qwen2.5-1.5B-Instruct + GSM8K. The model combines a continuous M-step latent loop, a strict bottleneck in which y depends only on z, and an InfoNCE identifiability loss between z and y. See `code/README.md` for architecture details and `HANDOFF_DACOT_PROPOSAL_2026-05-16.md` (in the main repo) for the full motivation.
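As a rough illustration of the InfoNCE term mentioned above, the sketch below computes a generic in-batch InfoNCE loss between latent vectors z and target embeddings y. It is not taken from the repo: the function name `info_nce`, the cosine similarity, and the temperature of 0.1 are all assumptions, and plain Python is used only so the example runs without dependencies.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce(z_batch, y_batch, temperature=0.1):
    """Generic in-batch InfoNCE: z_batch[i] should match y_batch[i].

    Every other y in the batch acts as a negative for z_batch[i].
    Hypothetical sketch; the repo's actual head and loss may differ.
    """
    z = [_normalize(v) for v in z_batch]
    y = [_normalize(v) for v in y_batch]
    losses = []
    for i, zi in enumerate(z):
        # Cosine similarity of z_i against every y_j, scaled by temperature.
        logits = [sum(a * b for a, b in zip(zi, yj)) / temperature for yj in y]
        # Numerically stable log-sum-exp over the batch.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        # Cross-entropy with the matching index i as the positive.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


if __name__ == "__main__":
    z = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print("aligned pairs :", info_nce(z, z))
    print("shuffled pairs:", info_nce(z, [z[1], z[2], z[0]]))
```

The loss is low when each z scores highest against its own y and grows when the pairing is scrambled, which is the property an identifiability loss of this kind relies on.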
Checkpoints (LoRA adapter + projector + InfoNCE head)
Each ckpt is ~25 MB because it contains only the trained adapter/projector/head; the base Qwen2.5-1.5B-Instruct is loaded fresh from HF on resume.
| step | K_train | files |
|---|---|---|
| 2000 | 4 | ckpts/ckpt-step2000/model/, projector.pt, head.pt |
| 4000 | 8 | ckpts/ckpt-step4000/model/, projector.pt, head.pt |
| 6000 | 8 | ckpts/ckpt-step6000/model/, projector.pt, head.pt |
| 8000 | 16 | ckpts/ckpt-step8000/model/, projector.pt, head.pt |
| 10000 | 16 | ckpts/ckpt-step10000/model/, projector.pt, head.pt |
| 12000 | 16 | ckpts/ckpt-step12000/model/, projector.pt, head.pt |
Pre-registered z-ablation results
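Pre-registered success criterion: Δ_random ≥ 15 pp AND Δ_zero ≥ 25 pp on GSM8K-test. Below are the interim results captured during training. Accuracies and deltas in the table are fractions (so +0.090 corresponds to 9 pp), and each delta is acc(normal) minus the corresponding ablated accuracy.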
| ckpt | K_eval | n | acc(normal) | acc(random) | acc(zero) | Δ_random | Δ_zero |
|---|---|---|---|---|---|---|---|
| ckpt-step2000 | 4 | 100 | 0.030 | 0.000 | 0.000 | +0.030 | +0.030 |
| ckpt-step2000 | 16 | 100 | 0.000 | 0.000 | 0.000 | +0.000 | +0.000 |
| ckpt-step6000 | 8 | 100 | 0.110 | 0.000 | 0.010 | +0.110 | +0.100 |
| ckpt-step8000 | 16 | 100 | 0.040 | 0.010 | 0.000 | +0.030 | +0.040 |
| ckpt-step10000 | 16 | 100 | 0.090 | 0.000 | 0.000 | +0.090 | +0.090 |
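To compare the interim rows against the pre-registered thresholds, the deltas can be recomputed in percentage points. The sketch below copies the accuracies from the table; the helper name `check_criterion` and the row layout are illustrative and not part of the repo.

```python
# (ckpt, K_eval, acc_normal, acc_random, acc_zero), copied from the table.
ROWS = [
    ("ckpt-step2000", 4, 0.030, 0.000, 0.000),
    ("ckpt-step2000", 16, 0.000, 0.000, 0.000),
    ("ckpt-step6000", 8, 0.110, 0.000, 0.010),
    ("ckpt-step8000", 16, 0.040, 0.010, 0.000),
    ("ckpt-step10000", 16, 0.090, 0.000, 0.000),
]


def check_criterion(acc_normal, acc_random, acc_zero,
                    min_random_pp=15.0, min_zero_pp=25.0):
    """Return (delta_random_pp, delta_zero_pp, passed) for one eval row."""
    delta_random_pp = (acc_normal - acc_random) * 100.0
    delta_zero_pp = (acc_normal - acc_zero) * 100.0
    passed = delta_random_pp >= min_random_pp and delta_zero_pp >= min_zero_pp
    return delta_random_pp, delta_zero_pp, passed


if __name__ == "__main__":
    for ckpt, k_eval, normal, random_z, zero_z in ROWS:
        d_rand, d_zero, ok = check_criterion(normal, random_z, zero_z)
        print(f"{ckpt} K={k_eval}: d_random={d_rand:.1f} pp, "
              f"d_zero={d_zero:.1f} pp, meets criterion: {ok}")
```

The largest interim Δ_random is 11 pp (ckpt-step6000, K_eval = 8), so none of the rows captured so far meets the 15 pp / 25 pp criterion. With n = 100, each correct answer moves accuracy by 1 pp, so these deltas are coarse.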
Resume training on a fresh instance
```bash
git clone <main-repo-with-experiments/blt_reasoner>  # or pull the code/ subdir here
pip install transformers peft bitsandbytes datasets safetensors huggingface_hub
python3 -m experiments.blt_reasoner.train \
  --config experiments/blt_reasoner/configs/pilot_qwen15b_gsm8k.json \
  --resume_from LauraGG/blt-reasoner-pilot1:ckpts/ckpt-step6000
```
Notes:
- The `--resume_from` flag (in `train.py`) accepts either a local ckpt path or a `LauraGG/blt-reasoner-pilot1:ckpts/ckpt-stepN` HF-Hub reference.
- Optimizer state is not preserved across resume. Expect a short loss spike (~100 to 300 steps) while Adam moments re-stabilize. The latent geometry (LoRA weights, projector, head) survives intact.
- The base model `Qwen/Qwen2.5-1.5B-Instruct` is fetched automatically.
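For readers who want to fetch a checkpoint outside of `train.py`, the sketch below shows one way a `repo:subpath` reference could be split and downloaded. How `train.py` actually parses `--resume_from` is not shown here, so `parse_resume_ref`, `fetch_checkpoint`, and the "exactly one slash before the colon" heuristic are assumptions. `snapshot_download` and its `allow_patterns` argument are part of `huggingface_hub`.

```python
import os
from typing import NamedTuple, Optional


class ResumeRef(NamedTuple):
    repo_id: Optional[str]  # None means the reference is a local path
    path: str               # local path, or sub-path inside the Hub repo


def parse_resume_ref(ref: str) -> ResumeRef:
    """Split 'owner/name:sub/path' into repo id and sub-path (hypothetical)."""
    if os.path.exists(ref):
        return ResumeRef(None, ref)
    repo_id, sep, subpath = ref.partition(":")
    # Assumed heuristic: Hub repo ids look like 'owner/name'.
    if sep and subpath and repo_id.count("/") == 1:
        return ResumeRef(repo_id, subpath)
    return ResumeRef(None, ref)


def fetch_checkpoint(ref: str) -> str:
    """Return a local directory holding the checkpoint files."""
    parsed = parse_resume_ref(ref)
    if parsed.repo_id is None:
        return parsed.path
    # Imported lazily so that parsing works without huggingface_hub installed.
    from huggingface_hub import snapshot_download
    root = snapshot_download(
        repo_id=parsed.repo_id,
        allow_patterns=[f"{parsed.path}/*"],  # download only this checkpoint
    )
    return os.path.join(root, parsed.path)


if __name__ == "__main__":
    print(parse_resume_ref("LauraGG/blt-reasoner-pilot1:ckpts/ckpt-step6000"))
```

Restricting the download with `allow_patterns` avoids pulling every checkpoint in the repo when only one step is needed.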
Logs and intermediate artifacts
- `logs/run.log`: full training log
- `logs/metrics.jsonl`: per-step loss/metric breakdown
- `logs/auto_eval.log`: poller daemon log (auto-eval on train exit)
- `logs/interim_*.log`: interim ablation logs
- `code/`: full `experiments/blt_reasoner/` source tree at upload time
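A smoothed loss curve from `logs/metrics.jsonl` is a convenient way to inspect the post-resume loss spike noted above. The sketch below reads a JSON Lines file and applies a trailing moving average. The field names `step` and `loss` are assumptions, since the actual schema of `metrics.jsonl` is not documented here.

```python
import json


def load_metrics(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def moving_average(values, window):
    """Trailing moving average; early entries average over fewer points."""
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


if __name__ == "__main__":
    records = load_metrics("logs/metrics.jsonl")
    # "step" and "loss" are assumed key names; adjust to the real schema.
    steps = [r["step"] for r in records if "loss" in r]
    losses = [r["loss"] for r in records if "loss" in r]
    for step, value in zip(steps, moving_average(losses, window=50)):
        print(step, round(value, 4))
```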