dtokens-checkpoints
Pretrained checkpoints for the run archive of "A modified language model and its observable behavioural signatures" (the d-token toy-scale experimental programme, §6 of the paper). These are the from-scratch base checkpoints that the reported runs start from; they let you re-run or regenerate the experiments.
You do not need these weights to audit the paper's numbers: every quantitative claim
is recomputed from the JSON run logs in the archive by verify_numbers.py and the block
verifiers, none of which load a checkpoint. The weights are required only to reproduce a
run from its starting point.
- Models: 18 small from-scratch transformers (TinyGPT, ~2.73 M params each)
- Format: raw PyTorch state_dict blobs (not 🤗 Transformers models; see Loading)
- Total size: ~189 MB (≈11 MB per file)
- Pinned commit: ab36ea93fdae0fc3b88ddf829289f67932bd4191
Model description
Each checkpoint is a single torch.save dict with four keys:
| key | contents |
|---|---|
| cfg | TinyGPT config: vocab=39, ctx=256, d=192, n_layer=6, n_head=4, dropout=0.0, tie=False |
| sd | the model state_dict (77 tensors) |
| rho | the synthetic-corpus conformity rate used in pretraining |
| seed | the pretraining seed |
The architecture is a compact decoder-only transformer (code/model.py in the archive),
trained from scratch on a synthetic, rho-controlled corpus, not a fine-tune or
derivative of any pretrained model. The values inside each file (cfg, rho, seed)
are authoritative.
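As a sanity check on the figures above, the parameter and tensor counts can be estimated from cfg alone. The sketch below assumes a GPT-2-style layout: learned token and position embeddings, biased linear layers, two LayerNorms per block, a 4x MLP, a final LayerNorm, and an untied, bias-free output head with no persistent buffers. That layout is an assumption and is not taken from code/model.py, which remains authoritative; count_params and mlp_mult are hypothetical names.

```python
def count_params(vocab=39, ctx=256, d=192, n_layer=6, mlp_mult=4):
    """Estimate (n_params, n_tensors) for an ASSUMED GPT-2-style layout."""
    tok_emb = vocab * d                        # token embedding table
    pos_emb = ctx * d                          # learned position embedding (assumed)
    ln = 2 * d                                 # LayerNorm weight + bias
    attn = (d * 3 * d + 3 * d) + (d * d + d)   # fused QKV + output projection, with biases
    h = mlp_mult * d                           # MLP hidden width (4x is an assumption)
    mlp = (d * h + h) + (h * d + d)            # two biased linear layers
    block = 2 * ln + attn + mlp
    head = d * vocab                           # untied head (tie=False), assumed bias-free

    n_params = tok_emb + pos_emb + n_layer * block + ln + head
    # 12 tensors per block (6 modules x weight/bias), 2 embeddings,
    # 2 for the final LayerNorm, 1 for the head.
    n_tensors = 2 + n_layer * 12 + 2 + 1
    return n_params, n_tensors


n_params, n_tensors = count_params()
print(f"{n_params:,} parameters in {n_tensors} tensors")
```

Under these assumptions the estimate comes to 2,733,696 parameters in 77 tensors, which agrees with the ~2.73 M and 77-tensor figures quoted above.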
Loading
These are raw state dicts, so load them directly with the archive's model.py rather than
via from_pretrained:
import torch
from model import TinyGPT, Config # from the run archive's code/ directory
ckpt = torch.load("ckpt_g192_r09_s0.pt", map_location="cpu", weights_only=False)
cfg = Config(**ckpt["cfg"])
model = TinyGPT(cfg)
model.load_state_dict(ckpt["sd"])
model.eval()
weights_only=False is needed because the file stores the small cfg/rho/seed
metadata alongside the tensors. That setting allows full unpickling, which can execute
arbitrary code, so load only checkpoints you trust (such as these); the SHA-256
table below lets you confirm you have exactly the published bytes.
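Because the values stored inside each file are authoritative, it can be useful to cross-check a loaded checkpoint against the documented layout before using it. The helper below is a minimal sketch: check_checkpoint is a hypothetical name, it operates on the plain dict returned by torch.load, and it assumes rho and seed are stored as plain numbers. The demonstration uses a stand-in dict so that it runs without PyTorch.

```python
import math

EXPECTED_KEYS = {"cfg", "sd", "rho", "seed"}


def check_checkpoint(ckpt, rho, seed, n_tensors=77):
    """Raise ValueError if a loaded checkpoint dict disagrees with the documented layout."""
    missing = EXPECTED_KEYS - set(ckpt)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if not math.isclose(float(ckpt["rho"]), rho, abs_tol=1e-9):
        raise ValueError(f"rho is {ckpt['rho']}, expected {rho}")
    if int(ckpt["seed"]) != seed:
        raise ValueError(f"seed is {ckpt['seed']}, expected {seed}")
    if len(ckpt["sd"]) != n_tensors:
        raise ValueError(f"state_dict has {len(ckpt['sd'])} tensors, expected {n_tensors}")
    return True


# Stand-in for torch.load(...): same four keys, placeholder tensors.
fake_ckpt = {
    "cfg": {"vocab": 39, "ctx": 256, "d": 192},
    "sd": {f"tensor_{i}": None for i in range(77)},
    "rho": 0.9,
    "seed": 0,
}
print(check_checkpoint(fake_ckpt, rho=0.9, seed=0))
```

With a real file, pass the dict from torch.load together with the rho and seed listed for that file in the Checkpoints table.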
Checkpoints
| file | rho | seed | role |
|---|---|---|---|
| ckpt_g192_r09_s0.pt | 0.9 | 0 | Primary demo base: blocks A/A′ (P1 race), D/D′ (write-channel persistence & ablation), R (R10p/R11p pin), frame-conditioned controls |
| ckpt_inert_2M_s0.pt | 0.9 | 0 | Block-E P2 instrument (qualified pool); B2 = s0 qualification PASS |
| ckpt_inert_2M_s1.pt | 0.9 | 1 | Inert-2M instrument pool (qualification) |
| ckpt_inert_2M_s2.pt | 0.9 | 2 | Block-E P2 instrument (qualified pool); E4 frozen arm |
| ckpt_inert_2M_r05_s0.pt | 0.5 | 0 | Block C rho-precondition map |
| ckpt_inert_2M_r05_s1.pt | 0.5 | 1 | Block C rho-precondition map |
| ckpt_inert_2M_r05_s2.pt | 0.5 | 2 | Block C rho-precondition map |
| ckpt_inert_2M_r07_s0.pt | 0.7 | 0 | Block C rho-precondition map |
| ckpt_inert_2M_r07_s1.pt | 0.7 | 1 | Block C rho-precondition map |
| ckpt_inert_2M_r07_s2.pt | 0.7 | 2 | Block C rho-precondition map |
| ckpt_inert_r09_s0.pt | 0.9 | 0 | Block F paired-pretrain, inert arm |
| ckpt_inert_r09_s1.pt | 0.9 | 1 | Block F paired-pretrain, inert arm |
| ckpt_inert_r09_s2.pt | 0.9 | 2 | Block F paired-pretrain, inert arm |
| ckpt_nodemo_r09_s0.pt | 0.9 | 0 | Block F paired-pretrain, nodemo arm |
| ckpt_nodemo_r09_s1.pt | 0.9 | 1 | Block F paired-pretrain, nodemo arm |
| ckpt_nodemo_r09_s2.pt | 0.9 | 2 | Block F paired-pretrain, nodemo arm |
| ckpt_g192_r09_s1.pt | 0.9 | 1 | Round-3 seed-robustness independent init |
| ckpt_g192_r09_s2.pt | 0.9 | 2 | Round-3 seed-robustness independent init |
The two g192_r09_s1/s2 files belong in the archive's round3_seed_robustness/checkpoints/;
the round-3 seed-0 init is bit-identical to ckpt_g192_r09_s0.pt and is not duplicated.
Retrieval and integrity
The run archive ships a checkpoint_kit/ that downloads these files, verifies each one's
SHA-256 fail-closed, and restores them to their canonical locations:
cd checkpoint_kit
pip install huggingface_hub # optional; a urllib fallback is built in
python3 fetch_checkpoints.py --all # downloads, SHA-checks, restores to checkpoints/ and
# round3_seed_robustness/checkpoints/
Or pull a single file directly:
from huggingface_hub import hf_hub_download
p = hf_hub_download("tmbgreaves/dtokens-checkpoints", "ckpt_g192_r09_s0.pt",
revision="ab36ea93fdae0fc3b88ddf829289f67932bd4191")
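Where huggingface_hub is not installed, the same pinned file can be addressed by URL. The sketch below builds that URL with the Hub's standard resolve endpoint; it only illustrates the idea and is not the fallback code inside fetch_checkpoints.py, whose internals are not reproduced here. pinned_url is a hypothetical helper.

```python
from urllib.parse import quote

REPO = "tmbgreaves/dtokens-checkpoints"
REVISION = "ab36ea93fdae0fc3b88ddf829289f67932bd4191"  # pinned commit


def pinned_url(filename, repo=REPO, revision=REVISION):
    """Return the Hub 'resolve' URL for one file at a fixed revision."""
    return f"https://huggingface.co/{repo}/resolve/{revision}/{quote(filename)}"


url = pinned_url("ckpt_g192_r09_s0.pt")
print(url)
# To download with the standard library only:
#   import urllib.request
#   urllib.request.urlretrieve(url, "ckpt_g192_r09_s0.pt")
```

Pinning the revision to a full commit hash rather than a branch name means the URL keeps pointing at the same bytes; the SHA-256 check should still be run after any download.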
The SHA-256 of each file is the integrity anchor. These digests were computed from the original archive bytes and equal Hugging Face's own recomputed git-LFS oids at upload:
| file | bytes | sha256 |
|---|---|---|
| ckpt_g192_r09_s0.pt | 10,962,279 | bbd16937209bb558cf39a869c5cfb03a7e9c48c6f263a73d31100767131fecd9 |
| ckpt_inert_2M_r05_s0.pt | 10,962,611 | 0f7cee3a2e56862a5ffb11b9cdb1cc4abb688017ce47394fb71ca9fe69809fba |
| ckpt_inert_2M_r05_s1.pt | 10,962,611 | d00df20a821d1fd89bb736a5e8e3032b0054befd1bb5bbdb8f40e0c5b1875624 |
| ckpt_inert_2M_r05_s2.pt | 10,962,611 | f5897111752bb91e962a31d12e5c8299d92ad318c43cf54e79d8444d01f88dff |
| ckpt_inert_2M_r07_s0.pt | 10,962,611 | 010647d3db5bd3771aa980aea7046a87c621bf54ff98702f417830360696b37f |
| ckpt_inert_2M_r07_s1.pt | 10,962,611 | 31961b5b61011235aa2c3c2ce37ae72108b0ea217f9019ca82a7a7d7d15503e1 |
| ckpt_inert_2M_r07_s2.pt | 10,962,611 | 118145f516fb693eaed8f0b445eb67f9d3814b8755aefae663cd473ff28f1ac9 |
| ckpt_inert_2M_s0.pt | 10,962,279 | f136b2b7915267bee7a0261784c02de19bf034965c80d36c7d8803a3e0728bf6 |
| ckpt_inert_2M_s1.pt | 10,962,279 | 5fa22b25a27b01430469bff620d727588c3d19134a18ad7b4f367776c5a0337c |
| ckpt_inert_2M_s2.pt | 10,962,279 | bae7039f2b25029b95a3dfff41d6208b533c9cd9a3ac233d562bb08262e4111a |
| ckpt_inert_r09_s0.pt | 10,962,362 | 882fa418e8275abcb7dfe06189d794b57e33152e2e5651eca59c6a9f3445ae85 |
| ckpt_inert_r09_s1.pt | 10,962,362 | 5ad49f7be51d460f499c32083b37e5f4a592eab8234a587af3181f48670f6006 |
| ckpt_inert_r09_s2.pt | 10,962,362 | f3654c4398533b68268a80cbe6785e4711fef0c537684694798dc53b2267f4bb |
| ckpt_nodemo_r09_s0.pt | 10,962,445 | 562fa8c4e5763f59fee5bd32dd7f8774f0b5dc1160c0a61eea1ba451436fbb8f |
| ckpt_nodemo_r09_s1.pt | 10,962,445 | faa7e85dbbde64aa9b953d2efaa3a6df4e7484265e939d78b0f0c3dd062c3de2 |
| ckpt_nodemo_r09_s2.pt | 10,962,445 | b0e75f91bf08515bd361c6e06a60bcf524b8816a46a1bf7fdb144f78f2337b5b |
| ckpt_g192_r09_s1.pt | 10,962,279 | 00098b669d96c98ea1f433007a24b98718c93e96d1e45355a541c690dc790a25 |
| ckpt_g192_r09_s2.pt | 10,962,279 | 4ce5201619c211653515f2b43e477769c87bb93921f5d5aad4b5666a77ca9c76 |
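A file obtained by any route can be checked against this table with the standard library alone. The following is a minimal sketch of a fail-closed check, not the code shipped in checkpoint_kit/; sha256_of and verify are hypothetical names. It compares the byte count first as a cheap early check, then the digest, and raises on any mismatch.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream the file in chunks so memory use stays flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_sha256, expected_bytes=None):
    """Return the digest if the file matches; raise ValueError otherwise."""
    path = Path(path)
    if expected_bytes is not None:
        actual_bytes = path.stat().st_size
        if actual_bytes != expected_bytes:
            raise ValueError(f"{path.name}: {actual_bytes} bytes, expected {expected_bytes}")
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise ValueError(f"{path.name}: sha256 {actual} does not match the published digest")
    return actual


# Example call, using the first row of the table:
#   verify("ckpt_g192_r09_s0.pt",
#          "bbd16937209bb558cf39a869c5cfb03a7e9c48c6f263a73d31100767131fecd9",
#          expected_bytes=10_962_279)
```

Running the check before torch.load matters here because loading uses weights_only=False, so an unverified file should never reach the unpickler.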
Intended use and limitations
These are research artifacts for reproducibility, not general-purpose language models.
They are tiny models (vocab 39, 2.73 M params) trained on a synthetic symbolic corpus; they
have no natural-language capability and no use outside reproducing or extending the d-token
experiments. Pretraining is bf16/GPU-sensitive: a regenerated checkpoint reproduces the
short pretrain bit-exactly on the same hardware, but long online runs are trajectory-
dependent and are not guaranteed to reproduce bit-for-bit across machines (see the archive's
RESULTS_seed_robustness.md). For that reason, a regenerated checkpoint must be re-qualified
(run-plan v5-v6 protocol) before its conformity instrument is trusted.
Provenance and authorship
The paper and its analysis were authored by Claude model instances (Sonnet, Opus, Fable).
No human is an author or takes responsibility for the content; a human collaborator
contributed the seed idea (the d-token) and orchestrated the sessions (running commands and
relaying run plans, critiques, and output between instances) but judged none of the results.
Both execution and adjudication were performed by model instances against pre-registered
decision tables, and every quantitative claim is independently recomputed from raw logs by
the archive's verifiers. See the run archive's README.md and the paper's authorship note
and §2 for the full account.
Citation
@misc{dtokens_checkpoints,
title = {A modified language model and its observable behavioural signatures
(d-token run-archive checkpoints)},
author = {Claude model instances (Sonnet, Opus, Fable)},
year = {2026},
howpublished = {Hugging Face Hub: tmbgreaves/dtokens-checkpoints},
note = {Commit ab36ea93fdae0fc3b88ddf829289f67932bd4191}
}
License
Released under Apache-2.0. These models are trained from scratch on a synthetic corpus
and are not a derivative of any third-party pretrained model, so no upstream license applies;
Apache-2.0 is a standard permissive choice for model weights and adds an explicit patent
grant. (MIT or CC-BY-4.0 would be reasonable alternatives if you prefer a shorter or
attribution-style license; change the license: field in the model card metadata to match if you switch.)