
Sparse Autoencoders for Phi-3-mini-4k-instruct

These are trained Sparse Autoencoders (SAEs) for the microsoft/Phi-3-mini-4k-instruct model, compatible with the sae_lens library.

Repository Structure

layer_{L}/ - Final SAELens-compatible weights in safetensors format, one directory per target layer L.
checkpoints/ - Intermediate PyTorch .pt checkpoints saved at training steps.

Model Details

Model: microsoft/Phi-3-mini-4k-instruct
Layers: [21, 22, 23]
Architecture: Standard (ReLU) with pre-encoder centring (b_dec is subtracted from the input before encoding).
Expansion Factor: 16x (49152 features)
Tokens Trained: ~1034M
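
The architecture entry above can be read as the forward pass sketched below. This is a minimal NumPy illustration under stated assumptions, not the training code: the parameter names W_enc, b_enc, W_dec and b_dec follow common SAE convention, the weights are random placeholders, and tiny dimensions stand in for the real ones (d_in = 3072, the hidden size of Phi-3-mini, so 16 x 3072 = 49152 features).

```python
import numpy as np

D_IN_REAL = 3072                      # hidden size of Phi-3-mini-4k-instruct
EXPANSION = 16                        # expansion factor from the model card
D_SAE_REAL = D_IN_REAL * EXPANSION    # number of SAE features


def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Standard ReLU SAE with pre-encoder centring (illustrative sketch)."""
    x_centred = x - b_dec                                # centre the input with b_dec
    feats = np.maximum(x_centred @ W_enc + b_enc, 0.0)   # ReLU feature activations
    recon = feats @ W_dec + b_dec                        # decode and add b_dec back
    return feats, recon


# Toy dimensions and random placeholder weights (assumptions, for illustration only).
rng = np.random.default_rng(0)
d_in, d_sae = 8, 8 * EXPANSION
W_enc = rng.normal(size=(d_in, d_sae)) * 0.1
b_enc = np.zeros(d_sae)
W_dec = rng.normal(size=(d_sae, d_in)) * 0.1
b_dec = rng.normal(size=d_in) * 0.1

x = rng.normal(size=(4, d_in))        # a batch of 4 fake activation vectors
feats, recon = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
print(D_SAE_REAL, feats.shape, recon.shape)
```

With the real SAEs, x would be activations taken from the model at one of the target layers, and the weights would come from the released files rather than a random generator.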

Datasets

NuminaMath-CoT: 40%
HH-RLHF: 20%
FineWeb: 20%
JBB/HarmBench/AdvBench: 20%
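
As a rough guide to what the mixture means in tokens, the sketch below splits the ~1034M training tokens by the listed percentages. It assumes the percentages are shares of training tokens, which the card does not state explicitly; the function name token_budget is hypothetical.

```python
TOTAL_TOKENS_M = 1034  # approximate total from the model card, in millions

# Mixture weights as listed above (assumed to be shares of training tokens).
MIX = {
    "NuminaMath-CoT": 0.40,
    "HH-RLHF": 0.20,
    "FineWeb": 0.20,
    "JBB/HarmBench/AdvBench": 0.20,
}


def token_budget(total_m, mix):
    """Split a total token count (in millions) according to mixture weights."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return {name: total_m * share for name, share in mix.items()}


budget = token_budget(TOTAL_TOKENS_M, MIX)
for name, millions in budget.items():
    print(f"{name}: ~{millions:.0f}M tokens")
```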

Usage with SAELens

from sae_lens import SAE

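# Load the final SAE for a specific layer, e.g. layer 21.
# Older SAELens releases return a (sae, cfg_dict, sparsity) tuple as shown;
# if your version returns only the SAE, assign the result to `sae` directly.
sae, cfg_dict, sparsity = SAE.from_pretrained(
    release="analist/SAE-Phi-3-mini-4k-instruct",
    sae_id="layer_21",
    device="cuda",  # use "cpu" if no GPU is available
)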
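
To load all three layers, one option is a small loop like the sketch below. It assumes layers 22 and 23 use the same sae_id pattern as layer 21 (following the layer_{L}/ directory layout), and the helper names sae_ids and load_all are hypothetical. Because the return type of SAE.from_pretrained differs between SAELens versions, the sketch handles both the tuple and the single-object case.

```python
LAYERS = [21, 22, 23]
RELEASE = "analist/SAE-Phi-3-mini-4k-instruct"


def sae_ids(layers):
    """Build the sae_id strings for the given layers (assumed pattern: layer_{L})."""
    return [f"layer_{layer}" for layer in layers]


def load_all(device="cpu"):
    """Load one SAE per target layer; requires sae_lens and network access."""
    from sae_lens import SAE  # imported lazily so the helpers above work without it

    saes = {}
    for layer, sae_id in zip(LAYERS, sae_ids(LAYERS)):
        loaded = SAE.from_pretrained(release=RELEASE, sae_id=sae_id, device=device)
        # Some SAELens versions return (sae, cfg_dict, sparsity), others the SAE alone.
        saes[layer] = loaded[0] if isinstance(loaded, tuple) else loaded
    return saes


print(sae_ids(LAYERS))
```

Calling load_all(device="cuda") would then return a dictionary keyed by layer number.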
