atamyrat
committed on
Commit 7b7adba
1 Parent(s): 282a686
Updated README (the RoPE transpose issue is fixed in the inference code)
README.md
CHANGED
@@ -2,18 +2,45 @@
 license: llama2
 ---

-
+# Llama2-7B 4-bit quantized model for llama2.c (experimental)

-Source model: llama2-7b-chat + [AWQ](https://github.com/mit-han-lab/llm-awq) scales: https://huggingface.co/datasets/mit-han-lab/awq-model-zoo/blob/main/llama-2-7b-chat-w4-g128.pt
+Source model: llama2-7b-chat + [AWQ](https://github.com/mit-han-lab/llm-awq) quantized with precomputed scales from: https://huggingface.co/datasets/mit-han-lab/awq-model-zoo/blob/main/llama-2-7b-chat-w4-g128.pt

-
-
-Known issue: works only for ~20 tokens, model will be fixed/updated soon.
+Script used to export the model: https://github.com/atamurad/llama2.c/blob/int4-avx2/export_awq.py

 Inference code: https://github.com/atamurad/llama2.c/tree/int4-avx2

 ## Sample usage/prompt format:
+
+Hello World:
+```
+./run llama2-7b-4bit-awq/llama2-7b-chat.awq -p 0.0 -i "[INST] say hi [/INST]"
+[INST] say hi [/INST] Sure, I'd be happy to say hi to you! *smiling face* How are you today? Is there anything you'd like to chat about or ask me? I'm here to help with any questions you may have.
+
+Feel free to start a conversation or ask me anything, I'm here to assist you! *hi five*
+```
+
+Sample #2:
+```
+./run llama2-7b-4bit-awq/llama2-7b-chat.awq -p 0.9 -t 0.0 -i "[INST] write a poem about math [/INST]"
+[INST] write a poem about math [/INST] Sure! Here's a poem about math:
+nobody knows the secrets I hold
+In numbers and formulas, I'm told
+From pi to infinity, I'm the key
+To unlocking the mysteries of the universe
+
+I'm the language of logic, the voice of reason
+The rhythm of numbers, the beat of creation
+From geometry to calculus, I'm the way
+To unravel the mysteries of the universe today
+
+I'm the bridge between the known and unknown
+The bridge that connects the infinite and the one
+I'm the math that makes the world go round
+The math that keeps the universe profound
+
+So here's to math, the language of the mind
+The language that makes the universe design
+I'll keep on solving, keep on exploring
+For math is the key to unlocking forever.
 ```
-./run llama2-7b-4bit-awq/llama2-7b-chat.awq -i "[INST]say hi[/INST]"
-[INST]say hi[/INST] Hello! It's nice to meet you! How are you today?
-```
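Both updated sample commands follow the same pattern: the user message is wrapped in `[INST] ... [/INST]` (with a space on each side) and passed to `./run` through `-i`. That pattern might be sketched in shell as follows. `format_prompt` and `run_chat` are hypothetical helper names that do not exist in the repository, and the model path and flag values are simply copied from Sample #2, so treat them as assumptions rather than recommended settings.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): wrap a user message in
# the [INST] ... [/INST] markers used by the sample prompts in this README.
format_prompt() {
    printf '[INST] %s [/INST]' "$1"
}

# Hypothetical wrapper around the ./run binary. The model path and the
# -p / -t / -i flag values are copied from Sample #2 and are assumptions
# about a typical invocation, not verified defaults.
run_chat() {
    model="llama2-7b-4bit-awq/llama2-7b-chat.awq"
    ./run "$model" -p 0.9 -t 0.0 -i "$(format_prompt "$1")"
}
```

With these definitions sourced into a shell, `run_chat "write a poem about math"` would issue the same command as Sample #2, provided the `./run` binary and the model file are present at those paths.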