tim-lawson/mlsae-pythia-70m-deduped-x64-k128-tfm

tim-lawson committed on Sep 9

Commit 81f49fd • 1 Parent(s): 053000a

Update README.md

Files changed (1)

README.md +18 -5

@@ -3,10 +3,23 @@ language: en
 library_name: mlsae
 license: mit
 tags:
-- model_hub_mixin
-- pytorch_model_hub_mixin
+  - model_hub_mixin
+  - pytorch_model_hub_mixin
+datasets:
+  - monology/pile-uncopyrighted
 ---
-This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
-- Library: https://github.com/tslwn/mlsae
-- Docs: [More Information Needed]
+# mlsae-pythia-70m-deduped-x64-k128-tfm
+A Multi-Layer Sparse Autoencoder (MLSAE) trained on the residual stream
+activation vectors from every layer of
+[EleutherAI/pythia-70m-deduped](https://huggingface.co/EleutherAI/pythia-70m-deduped)
+with an expansion factor of 64 and k = 128, over 1 billion tokens from
+[monology/pile-uncopyrighted](https://huggingface.co/datasets/monology/pile-uncopyrighted).
+This model includes the underlying transformer.
+For more details, see:
+- Paper: <https://arxiv.org/abs/2409.04185>
+- GitHub repository: <https://github.com/tim-lawson/mlsae>
+- Weights & Biases project: <https://wandb.ai/timlawson-/mlsae>
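
The new README text gives the model's size in terms of an expansion factor of 64 and k = 128. The sketch below works through what those two numbers might imply. It is an illustration only, not code from the mlsae library. It assumes that the expansion factor is the ratio of autoencoder latents to the transformer's hidden size, that the hidden size of pythia-70m-deduped is 512, and that k is the number of latents kept by a top-k activation. The names `n_latents` and `top_k` are hypothetical helpers.

```python
# Illustrative sketch only: these helpers are hypothetical and are not
# part of the mlsae library.

D_MODEL = 512          # assumed hidden size of pythia-70m-deduped
EXPANSION_FACTOR = 64  # "x64" in the model name
K = 128                # "k128" in the model name


def n_latents(d_model: int, expansion_factor: int) -> int:
    """Number of latents, assuming expansion factor = latents / hidden size."""
    return d_model * expansion_factor


def top_k(pre_activations: list[float], k: int) -> list[float]:
    """Keep the k largest values and set every other entry to zero."""
    if k >= len(pre_activations):
        return list(pre_activations)
    # Indices sorted by value, largest first; ties keep their original order.
    order = sorted(
        range(len(pre_activations)),
        key=lambda i: pre_activations[i],
        reverse=True,
    )
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(pre_activations)]


if __name__ == "__main__":
    total = n_latents(D_MODEL, EXPANSION_FACTOR)
    print(f"latents: {total}")
    print(f"fraction of latents active per vector: {K / total}")
    print(top_k([0.1, 2.0, -1.0, 0.5], k=2))
```

Under these assumptions, at most 128 of 32768 latents would be non-zero for any single activation vector, whichever layer it comes from.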