tim-lawson/mlsae-Llama-3.2-3B-x64-k32


tim-lawson committed 13 days ago

Commit 940a64e • 1 Parent(s): fbbab80

Push model using huggingface_hub.

Files changed (2)

README.md +41 -3
model.safetensors +1 -1

README.md CHANGED

@@ -1,9 +1,47 @@
 ---
 tags:
 - model_hub_mixin
 - pytorch_model_hub_mixin
 ---
-This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
-- Library: [More Information Needed]
-- Docs: [More Information Needed]

 ---
+language: en
+library_name: mlsae
+license: mit
 tags:
+- arxiv:2409.04185
 - model_hub_mixin
 - pytorch_model_hub_mixin
 ---
+# Model Card for tim-lawson/mlsae-Llama-3.2-3B-x64-k32
+A Multi-Layer Sparse Autoencoder (MLSAE) trained on the residual stream activation
+vectors from [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B) with an
+expansion factor of R = 64 and sparsity k = 32, over 1 billion
+tokens from [monology/pile-uncopyrighted](https://huggingface.co/datasets/monology/pile-uncopyrighted).
+This model is a PyTorch TopKSAE module, which does not include the underlying
+transformer.
+### Model Sources
+- **Repository:** <https://github.com/tim-lawson/mlsae>
+- **Paper:** <https://arxiv.org/abs/2409.04185>
+- **Weights & Biases:** <https://wandb.ai/timlawson-/mlsae>
+## Citation
+**BibTeX:**
+```bibtex
+@misc{lawson_residual_2024,
+  title         = {Residual {{Stream Analysis}} with {{Multi-Layer SAEs}}},
+  author        = {Lawson, Tim and Farnik, Lucy and Houghton, Conor and Aitchison, Laurence},
+  year          = {2024},
+  month         = oct,
+  number        = {arXiv:2409.04185},
+  eprint        = {2409.04185},
+  primaryclass  = {cs},
+  publisher     = {arXiv},
+  doi           = {10.48550/arXiv.2409.04185},
+  urldate       = {2024-10-08},
+  archiveprefix = {arXiv}
+}
+```
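
The model card added in this diff describes the module as a TopKSAE with sparsity k = 32. As a rough illustration of what a top-k activation does, here is a minimal pure-Python sketch. The function names `topk_activation` and `encode` are hypothetical and are not taken from the mlsae repository, and the actual implementation may differ (for example, it may apply a ReLU or subtract a bias before encoding).

```python
import heapq


def topk_activation(pre_acts, k):
    """Keep the k largest pre-activations and zero out the rest."""
    if k <= 0:
        return [0.0] * len(pre_acts)
    # Indices of the k largest values (hypothetical helper, not the mlsae API).
    top = heapq.nlargest(k, range(len(pre_acts)), key=pre_acts.__getitem__)
    out = [0.0] * len(pre_acts)
    for i in top:
        out[i] = pre_acts[i]
    return out


def encode(x, w_enc, k):
    """Project x with an encoder matrix (one row per latent), then apply top-k."""
    pre_acts = [sum(w * xi for w, xi in zip(row, x)) for row in w_enc]
    return topk_activation(pre_acts, k)


# Toy example: 2 input dimensions, 4 latents, k = 2.
w_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.5, 0.5]]
latents = encode([2.0, 0.5], w_enc, k=2)
print(latents)
```

In the model described by the card, k would be 32 and the number of latents would be 64 times the width of the residual stream, so at most 32 latents are non-zero for each input vector.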

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bbb2f0e120a3ff89de81b47cff2a7b0f38bd4b7d699e54cf0d839f7822c0c20f
 size 4831850776

 version https://git-lfs.github.com/spec/v1
+oid sha256:69768c17b6112c553eff9ec01946f0d0859fe003cc354e5a96e2fb6955deba02
 size 4831850776
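
The unchanged LFS pointer size can be sanity-checked against the expansion factor stated in the model card. The sketch below assumes a residual-stream width of 3072 for Llama-3.2-3B, float32 storage, and one encoder plus one decoder weight matrix; none of these assumptions is stated in the diff itself.

```python
D_MODEL = 3072          # assumption: hidden size of meta-llama/Llama-3.2-3B
EXPANSION_FACTOR = 64   # from the model card (R = 64)
BYTES_PER_PARAM = 4     # assumption: weights stored as float32
FILE_SIZE = 4831850776  # size from the Git LFS pointer

# Number of latents implied by the expansion factor.
n_latents = D_MODEL * EXPANSION_FACTOR

# Assumption: one encoder and one decoder matrix, each D_MODEL x n_latents.
weight_bytes = 2 * D_MODEL * n_latents * BYTES_PER_PARAM

# Bytes left over for anything else in the file.
remainder = FILE_SIZE - weight_bytes

print(n_latents)     # 196608
print(weight_bytes)  # 4831838208
print(remainder)     # 12568
```

Under these assumptions, the two weight matrices account for all but 12568 bytes of the file. That remainder is small enough to be a bias vector plus the safetensors header, though the diff does not confirm the tensor layout.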