Commit 1c797af, committed by norabelrose
1 parent: 925b406

Upload folder using huggingface_hub

Files changed (3)
  1. README.md +12 -0
  2. config.json +10 -0
  3. pytorch_model.bin +3 -0
README.md ADDED
@@ -0,0 +1,12 @@
+ ---
+ license: apache-2.0
+ ---
+ Mamba-2.8b is a model using the [Mamba](https://arxiv.org/abs/2312.00752) architecture, with 2.8B parameters, trained on the Pile dataset.
+
+ Model code: https://github.com/state-spaces/mamba/tree/main
+
+ To load the model, follow the installation instruction in the code repo, and then:
+ ```
+ from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel
+ model = MambaLMHeadModel.from_pretrained("EleutherAI/Hermes-mamba-2.8b")
+ ```
config.json ADDED
@@ -0,0 +1,10 @@
+ {
+     "d_model": 2560,
+     "n_layer": 64,
+     "vocab_size": 50277,
+     "ssm_cfg": {},
+     "rms_norm": true,
+     "residual_in_fp32": true,
+     "fused_add_norm": true,
+     "pad_vocab_size_multiple": 8
+ }
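Note on `pad_vocab_size_multiple`: the config lists a `vocab_size` of 50277 that is not a multiple of 8, so the embedding table is presumably padded when the model is built. The sketch below shows that rounding under the assumption that padding simply rounds up to the next multiple; `padded_vocab_size` is a hypothetical helper written for illustration, not a function from `mamba_ssm`.

```python
import json

# The config.json added in this commit, reproduced as a string.
CONFIG_JSON = """
{
    "d_model": 2560,
    "n_layer": 64,
    "vocab_size": 50277,
    "ssm_cfg": {},
    "rms_norm": true,
    "residual_in_fp32": true,
    "fused_add_norm": true,
    "pad_vocab_size_multiple": 8
}
"""


def padded_vocab_size(vocab_size: int, multiple: int) -> int:
    """Round vocab_size up to the next multiple (assumed padding rule)."""
    remainder = vocab_size % multiple
    if remainder == 0:
        return vocab_size
    return vocab_size + (multiple - remainder)


config = json.loads(CONFIG_JSON)
padded = padded_vocab_size(config["vocab_size"], config["pad_vocab_size_multiple"])
print(padded)  # 50280
```

Under this assumption the embedding matrix would have 50280 rows, three more than the vocabulary it represents, so token ids at or above 50277 would not correspond to real tokens.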
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4abc4ff9d1fd8abfbe21ca47b716278d608ee475b654e5b2d8bb4ce958536c90
+ size 5548078554
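Note on the `pytorch_model.bin` entry: the three added lines are a Git LFS pointer, not the weights themselves. The pointer records the SHA-256 digest and byte size of the real file. A minimal sketch of parsing the pointer and checking a downloaded file against it follows; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path in the usage note is an assumption.

```python
import hashlib
from pathlib import Path

# The pointer content added in this commit.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:4abc4ff9d1fd8abfbe21ca47b716278d608ee475b654e5b2d8bb4ce958536c90
size 5548078554
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at path has the size and digest in the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as f:
        # Read in chunks so a multi-gigabyte file is never held in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(f"{pointer['size'] / 1e9:.2f} GB, {pointer['algo']}")
```

After downloading the weights, a call such as `matches_pointer("pytorch_model.bin", pointer)` would return `True` only if both the size and the digest agree with the pointer.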