BLT-Reasoner Pilot 1 — checkpoints + code

Compute-constrained latent reasoning pilot on Qwen2.5-1.5B-Instruct + GSM8K. The model combines a continuous M-step latent loop, a strict y→only-z bottleneck, and an InfoNCE z↔y identifiability loss. See code/README.md for architecture details and HANDOFF_DACOT_PROPOSAL_2026-05-16.md (in the main repo) for the full motivation.
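For readers unfamiliar with InfoNCE, here is a minimal, dependency-free sketch of the standard contrastive objective for paired vectors. It is not the training code from this repo: the function name `info_nce`, the cosine similarity, the temperature of 0.1, and the single z→y direction are all illustrative assumptions (a symmetric z↔y variant would typically average this with the y→z direction).

```python
import math


def info_nce(z, y, temperature=0.1):
    """Mean InfoNCE loss for a batch in which z[i] and y[i] form the positive pair.

    Hypothetical helper for illustration; z and y are lists of equal-length vectors.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    z = [normalize(v) for v in z]
    y = [normalize(v) for v in y]

    loss = 0.0
    for i, zi in enumerate(z):
        # Cosine similarity of z[i] to every y[j], scaled by the temperature.
        logits = [sum(a * b for a, b in zip(zi, yj)) / temperature for yj in y]
        # Numerically stable log-sum-exp over all candidates (positive + negatives).
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        # Cross-entropy with the matching index i as the target.
        loss += log_denominator - logits[i]
    return loss / len(z)
```

The loss is near zero when each z[i] is most similar to its own y[i], and equals log(batch size) when z carries no information about which y it belongs to.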

Checkpoints (LoRA adapter + projector + InfoNCE head)

Each checkpoint is ~25 MB because it holds only the trained adapter, projector, and head; the base Qwen2.5-1.5B-Instruct is loaded fresh from the Hugging Face Hub on resume.

step	K_train	files
2000	4	`ckpts/ckpt-step2000/model/`, `projector.pt`, `head.pt`
4000	8	`ckpts/ckpt-step4000/model/`, `projector.pt`, `head.pt`
6000	8	`ckpts/ckpt-step6000/model/`, `projector.pt`, `head.pt`
8000	16	`ckpts/ckpt-step8000/model/`, `projector.pt`, `head.pt`
10000	16	`ckpts/ckpt-step10000/model/`, `projector.pt`, `head.pt`
12000	16	`ckpts/ckpt-step12000/model/`, `projector.pt`, `head.pt`

Pre-registered z-ablation results

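Pre-registered success criterion: Δ_random ≥ 15 pp AND Δ_zero ≥ 25 pp on GSM8K-test, where Δ_random = acc(normal) − acc(random) and Δ_zero = acc(normal) − acc(zero). Below are the interim results captured during training, with n = 100 per row. Accuracies and deltas in the table are fractions, so +0.090 corresponds to 9 pp.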

ckpt	K_eval	n	acc(normal)	acc(random)	acc(zero)	Δ_random	Δ_zero
ckpt-step2000	4	100	0.030	0.000	0.000	+0.030	+0.030
ckpt-step2000	16	100	0.000	0.000	0.000	+0.000	+0.000
ckpt-step6000	8	100	0.110	0.000	0.010	+0.110	+0.100
ckpt-step8000	16	100	0.040	0.010	0.000	+0.030	+0.040
ckpt-step10000	16	100	0.090	0.000	0.000	+0.090	+0.090
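The comparison against the pre-registered thresholds can be sketched as follows. The accuracy values are copied from the interim table; the helper names `deltas_pp` and `meets_criterion` are hypothetical and not part of the repo's evaluation code.

```python
# Interim rows from the table: (ckpt, K_eval, acc_normal, acc_random, acc_zero).
ROWS = [
    ("ckpt-step2000", 4, 0.030, 0.000, 0.000),
    ("ckpt-step2000", 16, 0.000, 0.000, 0.000),
    ("ckpt-step6000", 8, 0.110, 0.000, 0.010),
    ("ckpt-step8000", 16, 0.040, 0.010, 0.000),
    ("ckpt-step10000", 16, 0.090, 0.000, 0.000),
]


def deltas_pp(acc_normal, acc_random, acc_zero):
    """Convert accuracy fractions into (delta_random, delta_zero) in percentage points."""
    d_random = round((acc_normal - acc_random) * 100, 1)
    d_zero = round((acc_normal - acc_zero) * 100, 1)
    return d_random, d_zero


def meets_criterion(d_random_pp, d_zero_pp, min_random=15.0, min_zero=25.0):
    """Pre-registered criterion: both deltas must reach their thresholds."""
    return d_random_pp >= min_random and d_zero_pp >= min_zero


for ckpt, k_eval, acc_n, acc_r, acc_z in ROWS:
    d_random, d_zero = deltas_pp(acc_n, acc_r, acc_z)
    status = "PASS" if meets_criterion(d_random, d_zero) else "below threshold"
    print(f"{ckpt} K={k_eval}: d_random={d_random} pp, d_zero={d_zero} pp, {status}")
```

On these interim rows the largest gap is 11 pp (ckpt-step6000, K_eval = 8), so no checkpoint reaches either threshold yet.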

Resume training on a fresh instance

git clone <main-repo-with-experiments/blt_reasoner>  # or pull the code/ subdir here
pip install transformers peft bitsandbytes datasets safetensors huggingface_hub
python3 -m experiments.blt_reasoner.train \
    --config experiments/blt_reasoner/configs/pilot_qwen15b_gsm8k.json \
    --resume_from LauraGG/blt-reasoner-pilot1:ckpts/ckpt-step6000

Notes:

The --resume_from flag (in train.py) accepts either a local ckpt path or a LauraGG/blt-reasoner-pilot1:ckpts/ckpt-stepN HF-Hub reference.
Optimizer state is not preserved across resume. Expect a short loss spike (~100–300 steps) while Adam moments re-stabilize. The latent geometry (LoRA weights, projector, head) survives intact.
The base model Qwen/Qwen2.5-1.5B-Instruct is fetched automatically.
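One way the two accepted forms of --resume_from could be told apart is sketched below. This is an assumption about the behavior, not the actual logic in train.py: `parse_resume_ref` and `fetch_hub_ckpt` are hypothetical names, and the `repo_id:subpath` split is inferred only from the reference format shown above. The download step uses `snapshot_download` from `huggingface_hub` with `allow_patterns` to pull a single checkpoint folder.

```python
import os
import re

# "<owner>/<repo>:<path inside repo>", e.g. LauraGG/blt-reasoner-pilot1:ckpts/ckpt-step6000
HUB_REF = re.compile(r"^(?P<repo_id>[\w.-]+/[\w.-]+):(?P<subpath>.+)$")


def parse_resume_ref(ref):
    """Return ("local", path) or ("hub", repo_id, subpath). Hypothetical helper."""
    if os.path.exists(ref):
        return ("local", ref)  # An existing local path always wins.
    match = HUB_REF.match(ref)
    if match:
        return ("hub", match["repo_id"], match["subpath"])
    return ("local", ref)  # No Hub pattern: treat as a (possibly missing) local path.


def fetch_hub_ckpt(repo_id, subpath):
    """Download only the requested checkpoint folder and return its local path."""
    from huggingface_hub import snapshot_download  # Imported lazily; needs network access.

    root = snapshot_download(repo_id, allow_patterns=[f"{subpath}/*"])
    return os.path.join(root, subpath)
```

For example, `parse_resume_ref("LauraGG/blt-reasoner-pilot1:ckpts/ckpt-step6000")` yields the repo id and the `ckpts/ckpt-step6000` subpath, which can then be passed to `fetch_hub_ckpt`.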

Logs and intermediate artifacts

logs/run.log — full training log
logs/metrics.jsonl — per-step loss/metric breakdown
logs/auto_eval.log — poller daemon log (auto-eval on train exit)
logs/interim_*.log — interim ablation logs
code/ — full experiments/blt_reasoner/ source tree at upload time
