This is a set of sparse autoencoders (SAEs) trained on Llama 3.1 8B using the 10B sample of the RedPajama v2 corpus, which comes out to roughly 8.5B tokens under the Llama 3 tokenizer. The SAEs are organized by hookpoint and can be loaded using the EleutherAI sae library.
With the sae library installed, you can access an SAE like this:
from sae import Sae
sae = Sae.load_from_hub("EleutherAI/sae-llama-3.1-8b-32x", hookpoint="layers.23.mlp")
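
Because the SAEs are organized by hookpoint, loading more than one comes down to building the right hookpoint strings. The sketch below is only an illustration: `mlp_hookpoint` and `load_mlp_saes` are hypothetical helpers, not part of the sae library, and it assumes that other MLP hookpoints follow the same `layers.<index>.mlp` pattern as the example above and are actually present in the repository, which you should check before relying on it.

```python
from typing import Dict, Iterable

# Llama 3.1 8B has 32 transformer blocks, indexed 0..31.
NUM_LAYERS = 32
REPO = "EleutherAI/sae-llama-3.1-8b-32x"


def mlp_hookpoint(layer: int) -> str:
    """Build a hookpoint name of the form used above, e.g. 'layers.23.mlp'.

    Hypothetical helper: the naming pattern is inferred from the single
    example in this README, not from the library's documentation.
    """
    if not 0 <= layer < NUM_LAYERS:
        raise ValueError(f"layer must be in [0, {NUM_LAYERS}), got {layer}")
    return f"layers.{layer}.mlp"


def load_mlp_saes(layers: Iterable[int], repo: str = REPO) -> Dict[str, object]:
    """Load one SAE per requested layer, keyed by hookpoint name.

    Assumes every requested hookpoint exists in the repository.
    """
    # Validate all names before downloading anything.
    hookpoints = [mlp_hookpoint(i) for i in layers]

    # Imported lazily so the name helper works without the library installed.
    from sae import Sae

    return {hp: Sae.load_from_hub(repo, hookpoint=hp) for hp in hookpoints}
```

Calling `load_mlp_saes([22, 23])` would then return a dictionary with the keys `layers.22.mlp` and `layers.23.mlp`, provided both hookpoints are available in the repository.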