Create README.md (#1)
- Create README.md (d09db1929bd60464843e55ad7df89f2afba6f138)
README.md
ADDED
@@ -0,0 +1,37 @@
---
pipeline_tag: text-generation
library_name: mlx
inference: false
tags:
- facebook
- meta
- llama
- llama-2
- mlx
license: llama2
---

# **Llama 2**

Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This repository contains the 7B fine-tuned chat model in `npz` format, suitable for use with Apple's MLX framework.

Weights have been converted from the original `bfloat16` type to `float16`, because NumPy has no native `bfloat16` dtype.
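
To make the dtype change concrete, here is a minimal standard-library sketch of what a `bfloat16` to `float16` conversion does to a single value. It is an illustration only, not the script used to convert these weights; the helper names `bf16_bits_to_float` and `to_float16` are made up for this example. A `bfloat16` value is the top 16 bits of a `float32`, so it has a much wider exponent range than `float16` but fewer mantissa bits. Values inside the normal `float16` range convert exactly, while magnitudes above 65504 do not fit.

```python
import struct


def bf16_bits_to_float(bits: int) -> float:
    """Decode a bfloat16 bit pattern (hypothetical helper).

    bfloat16 is the upper 16 bits of an IEEE 754 float32, so shifting the
    pattern left by 16 bits yields the equivalent float32 pattern.
    """
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def to_float16(value: float) -> float:
    """Round-trip a value through IEEE 754 half precision (hypothetical helper).

    The struct format character "e" is the 16-bit float; packing raises
    OverflowError when the value is too large for float16.
    """
    return struct.unpack("<e", struct.pack("<e", value))[0]


# 0x4049 is the bfloat16 pattern for 3.140625 (the closest bfloat16 to pi).
pi_bf16 = bf16_bits_to_float(0x4049)
print(pi_bf16, to_float16(pi_bf16))  # same value: it lies in float16's normal range

# 0x7F00 encodes 2**127, far above the float16 maximum of 65504.
huge = bf16_bits_to_float(0x7F00)
try:
    to_float16(huge)
except OverflowError:
    print("value does not fit in float16")
```

A real conversion pipeline may handle out-of-range values differently (for example by saturating or producing infinities), so treat the overflow branch as a demonstration of the range limit rather than of any particular tool's behavior.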

How to use with [MLX](https://github.com/ml-explore/mlx):

```bash
# Install mlx, mlx-examples, huggingface-cli
pip install mlx
pip install huggingface_hub hf_transfer
git clone https://github.com/ml-explore/mlx-examples.git

# Download model
export HF_HUB_ENABLE_HF_TRANSFER=1
huggingface-cli download --local-dir Llama-2-7b-chat-mlx mlx-llama/Llama-2-7b-chat-mlx

# Run example
python mlx-examples/llama/llama.py --prompt "My name is " Llama-2-7b-chat-mlx/ Llama-2-7b-chat-mlx/tokenizer.model
```
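
After the download step, it can be useful to confirm that the stored arrays really are `float16` without loading several gigabytes into memory. The sketch below relies only on the fact that an `.npz` file is a zip archive of `.npy` members, each starting with a small text header that records the dtype and shape. The function name `npz_dtypes` is hypothetical, and the weights file name in the usage comment is an assumption; substitute whatever `.npz` file the download actually produced.

```python
import ast
import struct
import zipfile


def npz_dtypes(path):
    """Map each array name in an .npz archive to (dtype string, shape).

    Only the .npy headers are read, so no array data is loaded.
    `path` may be a file name or a binary file-like object.
    """
    result = {}
    with zipfile.ZipFile(path) as archive:
        for member in archive.namelist():
            if not member.endswith(".npy"):
                continue
            with archive.open(member) as fh:
                if fh.read(6) != b"\x93NUMPY":
                    raise ValueError(f"{member} is not an .npy file")
                major, _minor = fh.read(2)
                # Format version 1 stores the header length in 2 bytes,
                # later versions use 4 bytes (both little-endian).
                if major == 1:
                    (header_len,) = struct.unpack("<H", fh.read(2))
                else:
                    (header_len,) = struct.unpack("<I", fh.read(4))
                # The header is a Python dict literal such as
                # {'descr': '<f2', 'fortran_order': False, 'shape': (4096, 4096), }
                header = ast.literal_eval(fh.read(header_len).decode("latin1"))
            result[member[: -len(".npy")]] = (header["descr"], header["shape"])
    return result


# Usage (the file name below is an assumption, not taken from the repository):
# for name, (dtype, shape) in npz_dtypes("Llama-2-7b-chat-mlx/weights.npz").items():
#     print(name, dtype, shape)
```

A dtype string of `<f2` denotes little-endian `float16`, which is what the conversion described above should produce for every weight array.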

Please refer to the [original model card](https://huggingface.co/meta-llama/Llama-2-7b-chat) for details on Llama 2.