atamyrat committed on
Commit 7b7adba
1 Parent(s): 282a686

updated README (RoPE transpose issue is fixed at inference code)

Files changed (1)
  1. README.md +35 -8
README.md CHANGED
@@ -2,18 +2,45 @@
  license: llama2
  ---
  
- ## experimental llama2-7b-4bit-awq quantized model for llama2.c
+ # Llama2-7B 4-bit quantized model for llama2.c (experimental)
  
- Source model: llama2-7b-chat + [AWQ](https://github.com/mit-han-lab/llm-awq) scales: https://huggingface.co/datasets/mit-han-lab/awq-model-zoo/blob/main/llama-2-7b-chat-w4-g128.pt
+ Source model: llama2-7b-chat + [AWQ](https://github.com/mit-han-lab/llm-awq) quantized with precomputed scales from: https://huggingface.co/datasets/mit-han-lab/awq-model-zoo/blob/main/llama-2-7b-chat-w4-g128.pt
  
- Export script: https://github.com/atamurad/llama2.c/blob/int4-avx2/export_awq.py
-
- Known issue: works only for ~20 tokens, model will be fixed/updated soon.
+ Script used to export the model: https://github.com/atamurad/llama2.c/blob/int4-avx2/export_awq.py
  
  Inference code: https://github.com/atamurad/llama2.c/tree/int4-avx2
  
  ## Sample usage/prompt format:
+
+ Hello World:
+ ```
+ ./run llama2-7b-4bit-awq/llama2-7b-chat.awq -p 0.0 -i "[INST] say hi [/INST]"
+ [INST] say hi [/INST] Sure, I'd be happy to say hi to you! *smiling face* How are you today? Is there anything you'd like to chat about or ask me? I'm here to help with any questions you may have.
+
+ Feel free to start a conversation or ask me anything, I'm here to assist you! *hi five*
+ ```
+
+ Sample #2:
+ ```
+ ./run llama2-7b-4bit-awq/llama2-7b-chat.awq -p 0.9 -t 0.0 -i "[INST] write a poem about math [/INST]"
+ [INST] write a poem about math [/INST] Sure! Here's a poem about math:
+ nobody knows the secrets I hold
+ In numbers and formulas, I'm told
+ From pi to infinity, I'm the key
+ To unlocking the mysteries of the universe
+
+ I'm the language of logic, the voice of reason
+ The rhythm of numbers, the beat of creation
+ From geometry to calculus, I'm the way
+ To unravel the mysteries of the universe today
+
+ I'm the bridge between the known and unknown
+ The bridge that connects the infinite and the one
+ I'm the math that makes the world go round
+ The math that keeps the universe profound
+
+ So here's to math, the language of the mind
+ The language that makes the universe design
+ I'll keep on solving, keep on exploring
+ For math is the key to unlocking forever.
  ```
- ./run llama2-7b-4bit-awq/llama2-7b-chat.awq -i "[INST]say hi[/INST]"
- [INST]say hi[/INST] Hello! It's nice to meet you! How are you today?
- ```
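
The updated samples wrap the user message in `[INST] ... [/INST]` (with spaces inside the tags) and pass it to `./run` via `-i`, optionally with `-p` and `-t`. A minimal Python sketch of assembling that command line is shown here. The helper names `format_prompt` and `build_command` are hypothetical and not part of the repository. Reading `-p` as top-p and `-t` as temperature follows upstream llama2.c and is an assumption about this fork.

```python
import shlex

# Model path exactly as it appears in the README samples.
MODEL_PATH = "llama2-7b-4bit-awq/llama2-7b-chat.awq"


def format_prompt(user_message: str) -> str:
    """Wrap one user message in the [INST] ... [/INST] tags used in the samples."""
    return f"[INST] {user_message.strip()} [/INST]"


def build_command(user_message, top_p=None, temperature=None,
                  binary="./run", model=MODEL_PATH):
    """Hypothetical helper: build the argv list for the README's ./run call."""
    argv = [binary, model]
    if top_p is not None:
        # -p appears in both samples; assumed to be the top-p value.
        argv += ["-p", str(top_p)]
    if temperature is not None:
        # -t appears in sample #2; assumed to be the sampling temperature.
        argv += ["-t", str(temperature)]
    # -i carries the formatted prompt, as in the samples.
    argv += ["-i", format_prompt(user_message)]
    return argv


if __name__ == "__main__":
    # Print a shell-quoted command line equivalent to the "Hello World" sample.
    print(shlex.join(build_command("say hi", top_p=0.0)))
```

The returned list can be passed directly to `subprocess.run` once the `run` binary from the inference branch has been built and the model file downloaded.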