---
license: cc-by-nc-4.0
library_name: saelens
---
Gemma Scope:
This is a landing page for Gemma Scope, a comprehensive, open suite of sparse autoencoders for Gemma 2 9B and 2B. Sparse autoencoders (SAEs) are a "microscope" of sorts that can help us break down a model’s internal activations into the underlying concepts, just as biologists use microscopes to study the individual cells of plants and animals.
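Concretely, an SAE is trained so that each activation vector is approximately reconstructed as a sparse, non-negative combination of learned feature directions. The formula below is the generic SAE reconstruction, written here as a sketch; Gemma Scope uses JumpReLU SAEs, and the technical report gives the exact formulation:

$$
\mathbf{x} \;\approx\; \mathbf{b}_{\text{dec}} + \sum_{i=1}^{d_{\text{sae}}} f_i(\mathbf{x})\,\mathbf{d}_i ,
$$

where $\mathbf{x}$ is an activation vector (e.g. a residual-stream activation), $f_i(\mathbf{x}) \ge 0$ is the activation of learned feature $i$ (zero for all but a few $i$ on any given input), and $\mathbf{d}_i$ is that feature's decoder direction.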
Key links:
- Learn more about Gemma Scope in our Google DeepMind blog post.
- Check out the interactive Gemma Scope demo made by Neuronpedia.
- Check out our Google Colab notebook tutorial for how to use Gemma Scope.
- Read the Gemma Scope technical report.
- Check out Mishax, a GDM internal tool that we used in this project to expose the internal activations inside Gemma 2 models.
Quick start:
You can get started with Gemma Scope by downloading the weights from any of our repositories (a minimal loading sketch follows this list):
- https://huggingface.co/google/gemma-scope-2b-pt-res
- https://huggingface.co/google/gemma-scope-2b-pt-mlp
- https://huggingface.co/google/gemma-scope-2b-pt-att
- https://huggingface.co/google/gemma-scope-2b-pt-transcoders
- https://huggingface.co/google/gemma-scope-9b-pt-res
- https://huggingface.co/google/gemma-scope-9b-pt-mlp
- https://huggingface.co/google/gemma-scope-9b-pt-att
- https://huggingface.co/google/gemma-scope-9b-it-res
- https://huggingface.co/google/gemma-scope-27b-pt-res
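For example, here is a minimal sketch of downloading a single SAE from one of these repositories with `huggingface_hub` and running its JumpReLU encoder in PyTorch. The repo id and filename below (`google/gemma-scope-2b-pt-res`, `layer_20/width_16k/average_l0_71/params.npz`) are example choices; the exact path (layer / width / target L0) varies, so browse a repo's file tree to pick the SAE you want. The Colab tutorial linked above walks through the full end-to-end version, including capturing Gemma 2's activations.

```python
# Minimal sketch: download one Gemma Scope SAE and run its JumpReLU encoder.
# The repo id and filename are example choices; the exact layer/width/L0 path
# differs per SAE, so check the repo's file listing.
import numpy as np
import torch
from huggingface_hub import hf_hub_download

path_to_params = hf_hub_download(
    repo_id="google/gemma-scope-2b-pt-res",
    filename="layer_20/width_16k/average_l0_71/params.npz",
)
params = np.load(path_to_params)
pt_params = {k: torch.from_numpy(v) for k, v in params.items()}


class JumpReLUSAE(torch.nn.Module):
    """JumpReLU SAE: a feature fires only above its learned threshold."""

    def __init__(self, d_model: int, d_sae: int):
        super().__init__()
        self.W_enc = torch.nn.Parameter(torch.zeros(d_model, d_sae))
        self.W_dec = torch.nn.Parameter(torch.zeros(d_sae, d_model))
        self.threshold = torch.nn.Parameter(torch.zeros(d_sae))
        self.b_enc = torch.nn.Parameter(torch.zeros(d_sae))
        self.b_dec = torch.nn.Parameter(torch.zeros(d_model))

    def encode(self, acts: torch.Tensor) -> torch.Tensor:
        pre_acts = acts @ self.W_enc + self.b_enc
        return (pre_acts > self.threshold) * torch.relu(pre_acts)

    def decode(self, features: torch.Tensor) -> torch.Tensor:
        return features @ self.W_dec + self.b_dec

    def forward(self, acts: torch.Tensor) -> torch.Tensor:
        return self.decode(self.encode(acts))


d_model, d_sae = pt_params["W_enc"].shape
sae = JumpReLUSAE(d_model, d_sae)
sae.load_state_dict(pt_params)

# `residual_acts` would come from a forward pass of Gemma 2 2B, with activations
# captured at the matching layer (e.g. via hooks); shape [batch, seq, d_model].
residual_acts = torch.randn(1, 8, d_model)  # placeholder activations
features = sae.encode(residual_acts)        # sparse feature activations
reconstruction = sae.decode(features)       # approximate original activations
```

To run this on real data, capture the residual-stream activations of Gemma 2 2B at the matching layer (for instance with a forward hook, as shown in the Colab tutorial) and pass them to `sae.encode`; most feature activations will be zero by construction.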
The full list of SAEs we trained, with the sites and layers they cover, is shown in the following table, adapted from Figure 1 of our technical report:
Gemma 2 Model | SAE Width | Attention | MLP | Residual | Tokens |
---|---|---|---|---|---|
2.6B PT (26 layers) | 2^14 ≈ 16.4K | All | All+ | All | 4B |
 | 2^15 | | | {12} | 8B |
 | 2^16 | All | All | All | 8B |
 | 2^17 | | | {12} | 8B |
 | 2^18 | | | {12} | 8B |
 | 2^19 | | | {12} | 8B |
 | 2^20 ≈ 1M | | | {5, 12, 19} | 16B |
9B PT (42 layers) | 2^14 | All | All | All | 4B |
 | 2^15 | | | {20} | 8B |
 | 2^16 | | | {20} | 8B |
 | 2^17 | All | All | All | 8B |
 | 2^18 | | | {20} | 8B |
 | 2^19 | | | {20} | 8B |
 | 2^20 | | | {9, 20, 31} | 16B |
27B PT (46 layers) | 2^17 | | | {10, 22, 34} | 8B |
9B IT (42 layers) | 2^14 | | | {9, 20, 31} | 4B |
 | 2^17 | | | {9, 20, 31} | 8B |
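Within each repository, every SAE is stored as a `params.npz` under a directory encoding its layer, width, and average L0 sparsity (e.g. `layer_20/width_16k/average_l0_71/params.npz`); treat the exact naming as repo-specific and check the file tree. A quick, hedged way to see what a given repository contains:

```python
# Sketch: enumerate the SAEs in one Gemma Scope repo by listing its files.
# The repo id is an example; swap in any of the repositories listed above.
from huggingface_hub import list_repo_files

files = list_repo_files("google/gemma-scope-2b-pt-res")
sae_paths = sorted(f for f in files if f.endswith("params.npz"))
print(f"{len(sae_paths)} SAEs in this repo, for example:")
for path in sae_paths[:5]:
    print(" ", path)
```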