cthiriet committed on
Commit
2133956
1 Parent(s): cf86e12

Add README.md

Files changed (1)
  1. README.md +57 -0
README.md ADDED
@@ -0,0 +1,57 @@
---
library_name: transformers
tags: []
---

# mambaoutai

# Usage

You need to install `transformers` from `main` until `transformers` 4.39.0 is released.

```bash
pip install git+https://github.com/huggingface/transformers@main
```

We also recommend you install both `causal-conv1d` and `mamba-ssm` using:

```bash
pip install "causal-conv1d>=1.2.0"
pip install mamba-ssm
```

If either of these two packages is not installed, the slower "eager" implementation will be used; otherwise, the more optimised CUDA kernels will be used.

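If you want to check which code path you will get, a quick probe like the one below works. This is a sketch of ours rather than part of the original card; it only assumes that the two packages are importable as `causal_conv1d` and `mamba_ssm` once installed.

```python
from importlib.util import find_spec

# The fast path needs both packages; otherwise transformers falls back to "eager".
has_kernels = all(find_spec(mod) is not None for mod in ("causal_conv1d", "mamba_ssm"))
print("optimised CUDA kernels" if has_kernels else "eager fallback")
```
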
## Generation

Use this snippet of code to generate text from the model:

```python
from transformers import MambaForCausalLM, AutoTokenizer

# Load the tokenizer and model from the Hub
tokenizer = AutoTokenizer.from_pretrained("lightonai/mambaoutai")
model = MambaForCausalLM.from_pretrained("lightonai/mambaoutai")

# Tokenize a prompt and generate a short completion
input_ids = tokenizer("What is a mamba?", return_tensors="pt")["input_ids"]
out = model.generate(input_ids, max_new_tokens=10)
print(tokenizer.batch_decode(out))
```

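On a machine with a CUDA GPU, you may prefer to run the model on the device and stream tokens as they are generated. The snippet below is our own sketch of one way to do that with `transformers`' `TextStreamer`; it assumes a CUDA device is available and that the checkpoint fits in half precision.

```python
import torch
from transformers import MambaForCausalLM, AutoTokenizer, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("lightonai/mambaoutai")
# Assumption: fp16 on a CUDA device; adjust dtype/device to your hardware
model = MambaForCausalLM.from_pretrained("lightonai/mambaoutai", torch_dtype=torch.float16).to("cuda")

input_ids = tokenizer("What is a mamba?", return_tensors="pt")["input_ids"].to("cuda")

# TextStreamer prints tokens to stdout as soon as they are generated
streamer = TextStreamer(tokenizer, skip_prompt=True)
model.generate(input_ids, max_new_tokens=50, streamer=streamer)
```
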
## Training checkpoints

Some of the training checkpoints are available as branches of this repository, each branch corresponding to the state of the model at a given point during training.

You can run inference with these training checkpoints by passing the `revision` parameter to the `from_pretrained` method. For example, to load the model checkpoint after 30000 steps of pretraining, you can use the following code:

```python
from transformers import MambaForCausalLM, AutoTokenizer

# Load the checkpoint from the "pre-30000" branch (after 30000 pretraining steps)
tokenizer = AutoTokenizer.from_pretrained("lightonai/mambaoutai", revision="pre-30000")
model = MambaForCausalLM.from_pretrained("lightonai/mambaoutai", revision="pre-30000")

input_ids = tokenizer("What is a mamba?", return_tensors="pt")["input_ids"]
out = model.generate(input_ids, max_new_tokens=10)
print(tokenizer.batch_decode(out))
```
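
To see which checkpoint branches are available, you can list the repository's refs with `huggingface_hub`. This is our own sketch; the names printed are whatever branches the repository actually publishes (e.g. `pre-30000` above).

```python
from huggingface_hub import list_repo_refs

# List every branch of the model repo; checkpoint branches follow the "pre-<step>" pattern
refs = list_repo_refs("lightonai/mambaoutai")
print(sorted(branch.name for branch in refs.branches))
```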