Training Procedure and Activation Details of SAEs

#1
by iain-ualberta - opened

Hi, my team is considering using your SAEs for a project, but the repository does not describe the training procedure or the activation functions used. Knowing these would help us report details in a publication, assess trustworthiness, and decide how to interpret the latent variables. Could you add these details to the README.md?

Decode Research org

You can see the exact training settings used for each SAE by looking at the runner_cfg.json in each SAE directory. For instance, here is the one for the layer 28 gemma-4-e2b SAE: https://huggingface.co/decoderesearch/gemma-4-saes/blob/main/gemma-4-e2b/btk-mat-layer-28-k-100/runner_cfg.json. All of these SAEs were trained with SAELens.
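For anyone who wants to pull those settings programmatically, here is a minimal sketch, not an official recipe. It assumes the `huggingface_hub` package is installed and takes the repo id and file path from the URL above. The helper names (`load_runner_cfg`, `flatten`, `find_settings`) and the keyword list are hypothetical. The actual field names inside runner_cfg.json are not stated in this thread, so the search is by substring rather than by exact key.

```python
import json
from typing import Any, Iterator

# Taken from the URL in the reply above.
REPO_ID = "decoderesearch/gemma-4-saes"
CFG_PATH = "gemma-4-e2b/btk-mat-layer-28-k-100/runner_cfg.json"


def load_runner_cfg(repo_id: str = REPO_ID, filename: str = CFG_PATH) -> dict:
    """Download runner_cfg.json from the Hub and parse it (needs network access)."""
    # Imported lazily so the pure helpers below work without huggingface_hub.
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(local_path, encoding="utf-8") as f:
        return json.load(f)


def flatten(cfg: Any, prefix: str = "") -> Iterator[tuple[str, Any]]:
    """Yield (dotted_path, value) pairs for every non-dict leaf in a nested config."""
    if isinstance(cfg, dict):
        for key, value in cfg.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else str(key))
    else:
        yield prefix, cfg


def find_settings(cfg: dict, keywords=("activation", "architecture")) -> dict:
    """Return leaves whose dotted path contains any keyword (case-insensitive).

    The default keywords are guesses at what the relevant fields might be
    called; adjust them after looking at the real file.
    """
    return {
        path: value
        for path, value in flatten(cfg)
        if any(kw in path.lower() for kw in keywords)
    }
```

Calling `find_settings(load_runner_cfg())` should list whatever activation- or architecture-related entries the config contains, and `dict(flatten(load_runner_cfg()))` gives every setting if you would rather read the full list.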
