Training Procedure and Activation Details of SAEs

by iain-ualberta - opened 3 days ago

Hi, my team is considering using your SAEs for a project, but the repo has no description of the training procedure or the activation functions used. Knowing these would let us report the details in a publication, judge how far to trust the SAEs, and decide how to interpret the latent variables. Could you add these details to the README.md?

chanind

Decode Research org 2 days ago

You can see the exact training settings used for each SAE by looking at the runner_cfg.json in each SAE dir. For instance, here's the one for the layer 28 gemma-4-e2b SAE: https://huggingface.co/decoderesearch/gemma-4-saes/blob/main/gemma-4-e2b/btk-mat-layer-28-k-100/runner_cfg.json. These were all trained with SAELens.
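For anyone who wants to pull those settings programmatically, here is a minimal sketch. It assumes `huggingface_hub` is installed and that the file is plain JSON. The repo id and file path are taken from the URL above. The helper names (`fetch_runner_cfg`, `flatten`, `find_settings`) are made up for illustration, and the sketch makes no assumption about which keys the config actually contains.

```python
import json
from pathlib import Path

# Taken from the URL in this thread.
REPO_ID = "decoderesearch/gemma-4-saes"
CFG_PATH = "gemma-4-e2b/btk-mat-layer-28-k-100/runner_cfg.json"


def fetch_runner_cfg(repo_id=REPO_ID, filename=CFG_PATH):
    """Download one runner_cfg.json from the Hub and parse it (needs network)."""
    # Imported lazily so the helpers below work without huggingface_hub.
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(repo_id=repo_id, filename=filename)
    return json.loads(Path(local_path).read_text())


def flatten(cfg, prefix=""):
    """Turn nested dicts into a flat {"outer.inner": value} mapping."""
    flat = {}
    for key, value in cfg.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


def find_settings(cfg, keywords):
    """Return the flattened entries whose key contains any of the keywords."""
    flat = flatten(cfg)
    return {
        key: value
        for key, value in flat.items()
        if any(word in key.lower() for word in keywords)
    }
```

Something like `find_settings(fetch_runner_cfg(), ["arch", "activation", "lr"])` would then list the matching entries. Those keywords are guesses, so print the whole flattened config first to see what the file really holds.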
