gemma-loupe

gemma-loupe is a complete set of sparse autoencoders (SAEs) for Google's Gemma 4 E2B model, one for each of its 35 text decoder layers, trained with sparsify.

what's in this repo

  • 35 SAEs, one per decoder layer (see sae-gemma-4-E2B-32x-1B/language_model.layers.*/)
  • max-activating examples for every feature across all layers (see sae-gemma-4-E2B-32x-1B/max_activations/)
  • an example notebook showing how to load, encode, and browse features

training details

  parameter                    value
  base model                   google/gemma-4-E2B (5.1B params)
  hook location                residual stream, post-block (language_model.layers.{0..34})
  d_in                         1,536
  expansion factor             32x
  d_sae (features per layer)   49,152
  activation                   TopK (k=100), multi_topk enabled (4x auxiliary k)
  auxk_alpha                   0.03
  BOS exclusion                enabled (token id 2 masked during training)
  training tokens              1B (RedPajama v2, head_middle partition, English)
  optimizer                    signum + ScheduleFree
  learning rate                2.89e-4
  batch size                   4 sequences × 1024 tokens, grad_acc_steps=4
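
to make the TopK activation above concrete, here is a minimal numpy sketch of a TopK encoder forward pass. the dimensions and weights are toy values for illustration (the released SAEs use d_in=1536, expansion 32x, d_sae=49,152, k=100, plus the auxiliary multi_topk loss during training, none of which is reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy dimensions for illustration; the released SAEs use
# d_in = 1536, expansion = 32, d_sae = 49,152, k = 100
d_in, expansion, k = 16, 32, 4
d_sae = d_in * expansion  # 512

W_enc = rng.standard_normal((d_in, d_sae)).astype(np.float32) * 0.1
b_enc = np.zeros(d_sae, dtype=np.float32)

def topk_encode(x):
    """keep only the k largest pre-activations per row; zero out the rest."""
    pre = x @ W_enc + b_enc
    idx = np.argpartition(pre, -k, axis=-1)[..., -k:]
    z = np.zeros_like(pre)
    np.put_along_axis(z, idx, np.take_along_axis(pre, idx, axis=-1), axis=-1)
    return z

x = rng.standard_normal((4, d_in)).astype(np.float32)  # 4 token activations
z = topk_encode(x)
print((z != 0).sum(axis=-1))  # each row keeps exactly k = 4 features
```

the point of TopK is that sparsity is enforced structurally: every token gets exactly k active features, so there is no L1 penalty to tune.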

max-activating examples

we sampled 10M tokens from RedPajama v2, storing the top 20 examples per feature. these are stored as parquet files with the following columns:

  • feature: index (int)
  • activation: peak activation value (float)
  • token: the activating token (string)
  • context: surrounding context window (string)
  • context_tokens: individual tokens in the context window (JSON)
  • token_activations: per-token activation values across the context (JSON)

quick start

from sparsify import SparseCoder

# load a single layer's SAE
sae = SparseCoder.load_from_hub(
    "rhizomatous/gemma-loupe",
    hookpoint="sae-gemma-4-E2B-32x-1B/language_model.layers.17",
    device="cuda",  # or "mps" / "cpu"
)

# load all 35 layers at once
saes = SparseCoder.load_many(
    "rhizomatous/gemma-loupe",
    device="cuda", # or "mps" / "cpu"
    pattern="sae-gemma-4-E2B-32x-1B/language_model.layers.*",
)

see the example notebook for a walkthrough including encoding text, inspecting per-token features, and browsing max-activating examples.
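
encoding text means feeding the SAE the residual stream at its hook location. a minimal PyTorch sketch of capturing a layer's post-block output with a forward hook, using a toy linear stack as a stand-in for the real model (only the layer index and d_in mirror the table above; everything else is illustrative):

```python
import torch
from torch import nn

# toy stand-in for the decoder stack; the real model is google/gemma-4-E2B,
# where d_in = 1536 and the hook sits after each of the 35 decoder blocks
d_model, n_layers = 1536, 35
layers = nn.ModuleList([nn.Linear(d_model, d_model) for _ in range(n_layers)])

captured = {}

def make_hook(idx):
    def hook(module, inputs, output):
        # store the post-block residual stream for this layer
        captured[idx] = output.detach()
    return hook

handle = layers[17].register_forward_hook(make_hook(17))

x = torch.randn(1, 8, d_model)  # (batch, seq, d_model)
h = x
with torch.no_grad():
    for layer in layers:
        h = layer(h)

handle.remove()
acts = captured[17]  # per-token activations to pass to the layer-17 SAE
print(acts.shape)
```

the captured (batch, seq, d_model) tensor is what you would hand to the loaded SAE's encode step; see the notebook for the full pipeline against the actual model.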

limitations

  • text-only. Gemma 4 E2B is trimodal (audio, vision, language), but these SAEs cover only the 35 text decoder layers; the vision and audio layers are not addressed. if you feel compelled, you could whack in an image or sound file and see what happens, but caveat utilitor!
  • English-only. gemma-loupe was trained on English web text from RedPajama v2. feature quality on other languages or domains may vary.

architecture reference

for details on Gemma 4 E2B's architecture, see Maarten Grootendorst's excellent A Visual Guide to Gemma 4.

citation

if you use these SAEs in your work, please cite:

@misc{gemma-loupe,
    title={gemma-loupe: Sparse Autoencoders for Gemma 4 E2B},
    author={viv shaw},
    year={2026},
    url={https://huggingface.co/rhizomatous/gemma-loupe},
}