Grounded Pythia-160M: write-channel measurement paper (Part II)
This repository hosts grounded_base.sd.pt, the grounded Pythia-160M base
checkpoint behind the grounded-recalibration cells (§2, §5, §6) of the write-channel
measurement paper ("Part II"). It is published as a reproduction artifact so the
grounded half of that paper can be re-run from scratch.
What this is
A derivative of EleutherAI/pythia-160m used as the grounded base for the grounded
read-back experiments in Part II. The file is a PyTorch state_dict (hence .sd.pt).
The matching recorded read-backs are distributed in the Part II run archive at
raw_logs/nlport/logs/grounded_base.jsonl; the grounded cells are auditable from
those logs without these weights; the checkpoint is needed only to regenerate
read-backs from the model.
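Because the recorded read-backs are stored as JSON Lines, they can be inspected with the standard library alone. The following is a minimal sketch, not the Part II verifier: the helper names iter_jsonl and first_record are hypothetical, and nothing is assumed about the fields inside each record.

```python
import json
from typing import Any, Iterator, Optional


def iter_jsonl(path: str) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of a JSON Lines file."""
    with open(path, "r", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: invalid JSON") from exc


def first_record(path: str) -> Optional[Any]:
    """Return the first record in the file, or None if the file has no records."""
    return next(iter_jsonl(path), None)
```

For example, first_record("raw_logs/nlport/logs/grounded_base.jsonl") would return the first recorded read-back, assuming the run archive has been unpacked into the current directory.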
Provenance
- SHA-256: 6c32cc1a245a6c71e1bc2397cc42d58dd4fb1c4e03e9a798a2ded91af0470dd2
- Size: 324,705,871 bytes (~309.7 MiB)
- The SHA-256 hash is identical for the original file (on the run machine) and the authors' archived copy, so there is no copy drift.
- The original timestamp (2026-06-13) predates the grounded runs (2026-06-16 onward), consistent with a base checkpoint built before the experiments that use it.
- Identity check (definitive): loading this checkpoint and running the bare
read-back reproduces the first record of
grounded_base.jsonl (per-phrasing values within rounding). That match, not the filename or timestamp, is what confirms these are the exact weights behind the published grounded numbers.
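The hash and size listed above can be checked locally before loading anything. This is a minimal sketch using only the standard library; the function names are hypothetical, and the expected values are copied from the Provenance list. It confirms that the download is intact. It does not replace the read-back identity check.

```python
import hashlib
import os

# Values copied from the Provenance list above.
EXPECTED_SHA256 = "6c32cc1a245a6c71e1bc2397cc42d58dd4fb1c4e03e9a798a2ded91af0470dd2"
EXPECTED_SIZE = 324_705_871  # bytes


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so the whole checkpoint is never held in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(
    path: str,
    expected_sha256: str = EXPECTED_SHA256,
    expected_size: int = EXPECTED_SIZE,
) -> bool:
    """Return True only if both the byte size and the SHA-256 digest match."""
    if os.path.getsize(path) != expected_size:
        return False  # cheap check first; skips hashing an obviously wrong file
    return sha256_of_file(path) == expected_sha256.lower()
```

Calling verify_checkpoint("grounded_base.sd.pt") should return True for an intact download.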
License
Derived from EleutherAI/pythia-160m, which is Apache-2.0; released here under
Apache-2.0. Confirm terms of any grounding corpus with the authors before
redistributing derived data.
Authorship caveat
The two papers this checkpoint supports are LLM-authored and, at the time of writing, pending human verification. This is released as a reproduction artifact; the accompanying claims have not yet been through human review. The point of the release is to make the underlying computation independently checkable regardless of who (or what) wrote it.
How to use
The file is a plain state_dict for a Pythia-160M (GPTNeoX) model. Load it through the
grounded loader in the Part II harness (the state_dict keys are defined by that
model construction). The minimal idea:
import torch
sd = torch.load("grounded_base.sd.pt", map_location="cpu")
# build the Pythia-160M model exactly as the Part II harness does, then:
model.load_state_dict(sd)
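Since the state_dict keys depend on how the harness constructs the model, a key comparison before loading can make a mismatch easier to diagnose. The helper below is a hypothetical sketch in plain Python. It works on any two collections of key names and mirrors the "missing" and "unexpected" terminology PyTorch uses when load_state_dict fails.

```python
from typing import Dict, Iterable, List


def diff_state_dict_keys(
    model_keys: Iterable[str], checkpoint_keys: Iterable[str]
) -> Dict[str, List[str]]:
    """Compare parameter names expected by a model with those in a checkpoint.

    "missing"    = keys the model expects but the checkpoint lacks.
    "unexpected" = keys in the checkpoint the model does not define.
    """
    model_set = set(model_keys)
    checkpoint_set = set(checkpoint_keys)
    return {
        "missing": sorted(model_set - checkpoint_set),
        "unexpected": sorted(checkpoint_set - model_set),
    }


# Intended use (assumes `model` was built by the Part II harness and `sd` was
# loaded with torch.load as shown above):
#   report = diff_state_dict_keys(model.state_dict().keys(), sd.keys())
#   both lists should be empty before calling model.load_state_dict(sd)
```

If either list is non-empty, the model was probably not constructed the way the harness constructs it.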
Then follow the read-back protocol in the Part II archive (verify_from_raw.py,
raw_logs/RAW_LOGS.md) to reproduce the grounded cells, and confirm identity against
grounded_base.jsonl as above.
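The identity check comes down to comparing regenerated per-phrasing values with the recorded ones, within rounding. The sketch below rests on assumptions and is not the archive's verifier. It assumes each read-back can be reduced to a flat mapping from a phrasing identifier to a float, and the tolerance of 1e-4 is a placeholder to be replaced by the rounding actually used in the logs. The function name compare_readback is hypothetical.

```python
import math
from typing import List, Mapping, Optional, Tuple

Mismatch = Tuple[str, Optional[float], Optional[float]]


def compare_readback(
    recorded: Mapping[str, float],
    regenerated: Mapping[str, float],
    abs_tol: float = 1e-4,  # placeholder tolerance, an assumption
) -> List[Mismatch]:
    """Return (key, recorded, regenerated) for every value that disagrees.

    A key present on only one side counts as a mismatch, with None standing in
    for the absent value. An empty list means the two read-backs agree.
    """
    mismatches: List[Mismatch] = []
    for key in sorted(set(recorded) | set(regenerated)):
        a = recorded.get(key)
        b = regenerated.get(key)
        if a is None or b is None:
            mismatches.append((key, a, b))
        elif not math.isclose(a, b, rel_tol=0.0, abs_tol=abs_tol):
            mismatches.append((key, a, b))
    return mismatches
```

An empty result for the first record is the outcome the Provenance section describes as the definitive identity check.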
Related
- Part I (toy existence proof): [fill in DOI/URL]
- Part II (measurement paper): [fill in DOI/URL]
- Run archive (verifiers, raw logs, CHECKPOINT.md): [fill in DOI/URL]