---
language: en
library_name: mlsae
license: mit
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
datasets:
- monology/pile-uncopyrighted
---

# mlsae-pythia-70m-deduped-x256-k32-tfm

A Multi-Layer Sparse Autoencoder (MLSAE) trained on the residual stream
activation vectors from every layer of
[EleutherAI/pythia-70m-deduped](https://huggingface.co/EleutherAI/pythia-70m-deduped)
with an expansion factor of 256 and k = 32, over 1 billion tokens from
[monology/pile-uncopyrighted](https://huggingface.co/datasets/monology/pile-uncopyrighted).
This model includes the underlying transformer.

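With an expansion factor of 256 over the 512-dimensional residual stream of
Pythia-70m, the autoencoder has 256 × 512 = 131,072 latents, k = 32 of which
are active for each token. Because the checkpoint uses `PyTorchModelHubMixin`,
it can be loaded with the standard `from_pretrained` interface. Below is a
minimal loading sketch, assuming the `MLSAETransformer` class and import path
from the GitHub repository linked below (these may differ between versions):

```python
from transformers import AutoTokenizer

# Import path and class name assumed from https://github.com/tim-lawson/mlsae.
from mlsae.model import MLSAETransformer

# Download and load the pretrained MLSAE together with the underlying
# transformer via the PyTorchModelHubMixin `from_pretrained` interface.
model = MLSAETransformer.from_pretrained(
    "tim-lawson/mlsae-pythia-70m-deduped-x256-k32-tfm"
)
model.eval()

# Tokenize text with the underlying transformer's tokenizer.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m-deduped")
inputs = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
```

The forward pass and analysis utilities are documented in the GitHub
repository linked below.
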
For more details, see:

- Paper: <https://arxiv.org/abs/2409.04185>
- GitHub repository: <https://github.com/tim-lawson/mlsae>
- Weights & Biases project: <https://wandb.ai/timlawson-/mlsae>