SONAR SAEs — scaled-up BatchTopK

Scaled-up BatchTopK Sparse Autoencoder trained on SONAR sentence embeddings. This is the SAE used for the latent-explanation experiment in:

Interpretability of Text Auto-Encoders using Sparse Auto-Encoders: A Sandbox for Interpreting Neuralese. Nicky Pochinkov & Jason Rich Darmawan, EACL 2026 (submitted).

Configuration


Architecture	BatchTopK
Input dim ($d_{\text{in}}$)	1024 (SONAR embedding)
SAE dim ($m$)	131,072
Sparsity ($k$)	64
dtype	float32
Training tokens	~1.228 B (NLLB-200 primary + mined + supplemental)
LR (constant)	$3\times 10^{-4} \to 3\times 10^{-5}$
Hardware	1× A100, ~32 hours
SAE Lens version	6.11.0

See Table 8 of the paper for the full hyperparameter list.
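
For readers unfamiliar with the architecture, the following is a minimal sketch of a BatchTopK forward pass as it is commonly described: instead of keeping the top $k$ latents per sample, the top $k \times \text{batch size}$ activations are kept across the whole batch. It is not the training code for this checkpoint. The parameter names (`W_enc`, `b_enc`, `W_dec`, `b_dec`), the ReLU, the decoder-bias subtraction, and the toy sizes are all assumptions made for illustration.

```python
import torch

def batch_topk(pre_acts: torch.Tensor, k: int) -> torch.Tensor:
    """Keep the k * batch_size largest activations across the whole batch, zero the rest."""
    batch_size = pre_acts.shape[0]
    acts = torch.relu(pre_acts)            # assumption: non-negative latents
    flat = acts.flatten()
    top = torch.topk(flat, k * batch_size)
    mask = torch.zeros_like(flat)
    mask[top.indices] = 1.0
    return (flat * mask).reshape(acts.shape)

torch.manual_seed(0)
# Toy sizes for illustration; the table above lists d_in=1024, m=131072, k=64.
d_in, m, k, batch = 16, 128, 4, 8

# Randomly initialised stand-in parameters (hypothetical names, not the checkpoint's).
W_enc = torch.randn(d_in, m) / d_in**0.5
b_enc = torch.zeros(m)
W_dec = torch.randn(m, d_in) / m**0.5
b_dec = torch.zeros(d_in)

x = torch.randn(batch, d_in)               # stand-in for SONAR embeddings
latents = batch_topk((x - b_dec) @ W_enc + b_enc, k)
x_hat = latents @ W_dec + b_dec            # reconstruction
```

Note that the sparsity budget is shared across the batch, so an individual sample may end up with more or fewer than $k$ active latents; only the batch average is bounded by $k$.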

Files

Each subdirectory corresponds to one wandb-tagged training run. Files are PyTorch Lightning checkpoints (named epoch=N-step=K.ckpt) that include optimizer state, so they are larger than the SAE weights alone.

g97mb3sb/ — primary scaled-up run (used in the paper)
pl1c7eo7/, dyfpsngy/ — additional scaled-up runs
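
Since checkpoint files follow the epoch=N-step=K.ckpt naming pattern, picking the most recent one in a run directory can be sketched as below. The helper names `parse_ckpt_name` and `latest_checkpoint` are hypothetical, and the sketch assumes the run directory has already been downloaded locally.

```python
import re
from pathlib import Path

# Matches Lightning-style names such as "epoch=3-step=12000.ckpt".
CKPT_PATTERN = re.compile(r"epoch=(\d+)-step=(\d+)\.ckpt$")

def parse_ckpt_name(name: str):
    """Return (epoch, step) parsed from a checkpoint file name, or None if it does not match."""
    match = CKPT_PATTERN.search(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

def latest_checkpoint(run_dir):
    """Return the path with the highest (epoch, step) in run_dir, or None if there is none."""
    candidates = [(parse_ckpt_name(p.name), p) for p in Path(run_dir).glob("*.ckpt")]
    candidates = [(key, p) for key, p in candidates if key is not None]
    return max(candidates, key=lambda item: item[0])[1] if candidates else None
```

For example, `latest_checkpoint("g97mb3sb")` would return the file with the largest epoch and step in the primary run directory, ignoring any file whose name does not follow the pattern.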

Loading

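import torch
import sae_lens  # version >= 6.11.0

# Lightning checkpoints hold more than tensors (config, optimizer state), so recent
# PyTorch versions may need weights_only=False; only use it on files you trust.
ckpt = torch.load("g97mb3sb/last.ckpt", map_location="cpu", weights_only=False)
state_dict = ckpt["state_dict"]  # SAE weights
# config is embedded in ckpt["hyper_parameters"]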
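
Because the checkpoints include optimizer state, it can be useful to inspect the stored tensors and keep a slimmer copy once loaded. The sketch below shows one way to do that. The parameter names in the stand-in checkpoint are made up, since the actual key names in these files are not documented here; `optimizer_states` is the key PyTorch Lightning normally uses for optimizer state.

```python
import torch

def summarize_checkpoint(ckpt: dict) -> dict:
    """Return {parameter name: shape} for every tensor in ckpt["state_dict"]."""
    return {name: tuple(t.shape) for name, t in ckpt["state_dict"].items()}

def weights_only_copy(ckpt: dict) -> dict:
    """Keep weights and config; drop optimizer state and other training bookkeeping."""
    keep = ("state_dict", "hyper_parameters")
    return {key: ckpt[key] for key in keep if key in ckpt}

# Stand-in checkpoint with hypothetical parameter names and toy sizes,
# so the example runs without downloading the real files.
toy_ckpt = {
    "state_dict": {"W_enc": torch.zeros(16, 128), "W_dec": torch.zeros(128, 16)},
    "hyper_parameters": {"k": 4},
    "optimizer_states": [{"state": {}}],
}

summary = summarize_checkpoint(toy_ckpt)   # maps each name to its tensor shape
slim = weights_only_copy(toy_ckpt)         # no "optimizer_states" entry
```

With a real checkpoint, passing the `ckpt` object from the loading snippet in place of `toy_ckpt` would list the actual parameter names and shapes.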

Related repos

nickypro/sonar-saes-comparison — four-variant architecture comparison (Table 1 of paper)
nickypro/sonar-saes-autointerp — automatic interpretability outputs
nickypro/sonar-saes-wandb-logs — training run logs

Citation

@inproceedings{pochinkov2026sonarsae,
  title={Interpretability of Text Auto-Encoders using Sparse Auto-Encoders: A Sandbox for Interpreting Neuralese},
  author={Pochinkov, Nicky and Darmawan, Jason Rich},
  booktitle={Proceedings of EACL 2026},
  year={2026}
}

