This repo contains several sweeps of SAEs trained on OthelloGPT, including sweeps used for the paper "Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models".
The Othello SAEs used in the paper are in these two archives:
othello-trained_model-layer_5-2024-05-23.zip and othello-random_model-layer_5-2024-05-23.zip
The SAEs are stored in zip files with a particular file structure. For download and usage directions, refer to https://github.com/adamkarvonen/SAE_BoardGameEval
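
To check what an archive contains before using it, a minimal sketch with Python's standard zipfile module is shown here. It assumes the zip file has already been downloaded to the working directory. The helper names list_zip_entries and extract_zip are illustrative and are not part of SAE_BoardGameEval. The internal layout of the archive is not described in this README, so the code only lists and extracts entries without assuming any particular file names inside.

```python
import zipfile
from pathlib import Path


def list_zip_entries(zip_path):
    """Return the sorted names of all entries in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(archive.namelist())


def extract_zip(zip_path, dest_dir):
    """Extract every entry into dest_dir and return the destination path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return dest


if __name__ == "__main__":
    # Assumption: the archive was downloaded manually to the current directory.
    archive_name = "othello-trained_model-layer_5-2024-05-23.zip"
    if Path(archive_name).exists():
        # Print the first few entries to get a feel for the file structure.
        for entry in list_zip_entries(archive_name)[:20]:
            print(entry)
    else:
        print(f"{archive_name} not found; download it first.")
```

For actually loading the SAEs, the directions in the SAE_BoardGameEval repository remain the reference; this sketch is only for inspecting the download.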