This is a preliminary HuggingFace implementation of the newly released MoE model by Mistral AI.

Thanks to @dzhulgakov for his early implementation (https://github.com/dzhulgakov/llama-mistral) that helped me find a working setup.
# Basic Inference setup
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
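# Shard the model across available GPUs; trust_remote_code is required for the custom MoE modeling code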
model = AutoModelForCausalLM.from_pretrained("DiscoResearch/mixtral-7b-8expert", low_cpu_mem_usage=True, device_map="auto", trust_remote_code=True)
tok = AutoTokenizer.from_pretrained("DiscoResearch/mixtral-7b-8expert")
x = tok.encode("The mistral wind is a phenomenon ", return_tensors="pt").cuda()
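# Generate up to 128 new tokens, then move the output back to the CPU for decoding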
x = model.generate(x, max_new_tokens=128).cpu()
print(tok.batch_decode(x))
```
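The full-precision checkpoint is large, so a common tweak if GPU memory is tight is to load the weights in half precision. A minimal variant of the load call above, assuming a CUDA setup with enough combined VRAM (`torch_dtype` is a standard `from_pretrained` argument, not something specific to this repo):

```python
import torch
from transformers import AutoModelForCausalLM

# Same load as in the snippet above, but materialize the weights in float16
# to roughly halve the memory footprint
model = AutoModelForCausalLM.from_pretrained(
    "DiscoResearch/mixtral-7b-8expert",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    device_map="auto",
    trust_remote_code=True,
)
```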
# Conversion
Use `convert_mistral_moe_weights_to_hf.py --input_dir ./input_dir --model_size 7B --output_dir ./output` to convert the original consolidated weights to this HF setup.
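Once converted, the output directory loads like any local Hugging Face checkpoint, so the inference snippet above works unchanged with a local path (a sketch, assuming the `./output` directory produced by the command above):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Point at the directory written by the conversion script instead of the Hub model ID
model = AutoModelForCausalLM.from_pretrained("./output", device_map="auto", trust_remote_code=True)
tok = AutoTokenizer.from_pretrained("./output")
```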
Come chat about this in our [Disco(rd)](https://discord.gg/S8W8B5nz3v)! :)