Update README.md
---
base_model:
- microsoft/Phi-3-mini-4k-instruct
license: other
license_name: llama2
license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2/LICENSE
---

# Phi-3-mini-4k-instruct-Weight-INT4-Per-Group-AWQ-Bfloat16
## Introduction
This model was created by applying [Quark](https://quark.docs.amd.com/latest/index.html) with calibration samples from the Pile dataset.
## Quantization Strategy
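As the model name indicates, the linear-layer weights are quantized to INT4 with per-group scales while computation stays in BFloat16, and the AWQ algorithm is used to calibrate the scaling on Pile samples. As a rough illustration only (this is not Quark's implementation, and the group size of 128 is an assumption), symmetric per-group INT4 fake-quantization of a weight matrix works like this:

```python
import torch

def fake_quantize_int4_per_group(weight: torch.Tensor, group_size: int = 128) -> torch.Tensor:
    """Illustrative symmetric per-group INT4 fake-quantization of a 2-D weight
    [out_features, in_features]; not Quark's implementation, group_size is an assumption."""
    out_features, in_features = weight.shape
    assert in_features % group_size == 0
    w = weight.float().reshape(out_features, in_features // group_size, group_size)
    # One scale per group, chosen so the largest magnitude maps to the INT4 limit (7).
    scale = w.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / 7.0
    q = torch.clamp(torch.round(w / scale), min=-8, max=7)   # INT4 range is [-8, 7]
    # Dequantize back to the original dtype (pseudo-quantization).
    return (q * scale).reshape(out_features, in_features).to(weight.dtype)

w = torch.randn(3072, 3072, dtype=torch.bfloat16)
w_q = fake_quantize_int4_per_group(w)
print("mean absolute quantization error:", (w.float() - w_q.float()).abs().mean().item())
```

Per-group scales localize the quantization error to each block of input channels instead of sharing one scale across a whole row, which is the main reason this scheme is combined with AWQ calibration.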
1. [Download and install Quark](https://quark.docs.amd.com/latest/install.html)
2. Run the quantization script in the example folder using the following command line:
```sh
export MODEL_DIR=[local model checkpoint folder]  # or: export MODEL_DIR=microsoft/Phi-3-mini-4k-instruct
# single GPU
python3 quantize_quark.py --model_dir $MODEL_DIR \
                          --data_type bfloat16 \
                          [...] \
                          --quant_algo awq \
                          --dataset pileval_for_awq_benchmark \
                          --seq_len 512 \
                          --output_dir Phi-3-mini-4k-instruct-W_Int4-Per_Group-AWQ-BFloat16 \
                          --model_export quark_safetensors
# cpu
python3 quantize_quark.py --model_dir $MODEL_DIR \
                          [...] \
                          --quant_algo awq \
                          --dataset pileval_for_awq_benchmark \
                          --seq_len 512 \
                          --output_dir Phi-3-mini-4k-instruct-W_Int4-Per_Group-AWQ-BFloat16 \
                          --model_export quark_safetensors \
                          --device cpu
```
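The `--model_export quark_safetensors` option writes the quantized checkpoint to the `--output_dir` folder in safetensors format. Assuming standard `.safetensors` files end up in that folder (the path below simply mirrors the `--output_dir` used above), a quick sanity check is to list the stored tensors:

```python
from pathlib import Path
from safetensors import safe_open

# Assumed path: the --output_dir passed to quantize_quark.py above.
export_dir = Path("Phi-3-mini-4k-instruct-W_Int4-Per_Group-AWQ-BFloat16")

for shard in sorted(export_dir.glob("*.safetensors")):
    # Open lazily so large shards are not read fully into memory.
    with safe_open(str(shard), framework="pt") as f:
        for name in f.keys():
            t = f.get_tensor(name)
            print(f"{shard.name}: {name} shape={tuple(t.shape)} dtype={t.dtype}")
```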
The quantization evaluation results are conducted in pseudo-quantization mode, [...]

<table>
  <tr>
    <td><strong>Benchmark</strong></td>
    <td><strong>Phi-3-mini-4k-instruct (Bfloat16)</strong></td>
    <td><strong>Phi-3-mini-4k-instruct-Weight-INT4-Per-Group-AWQ-Bfloat16 (this model)</strong></td>
  </tr>
  <tr>
    <td>Perplexity-wikitext2</td>
    <td>6.0164</td>
    <td>6.5575</td>
  </tr>
</table>
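The perplexity above is measured on the wikitext2 test set, with the quantized model evaluated in pseudo-quantization mode. As a reference only (this is not necessarily the exact harness used for the table, and the 2048-token window is an assumption), a wikitext2 perplexity can be computed with `transformers` and `datasets` along these lines:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # BFloat16 baseline; swap in a local checkpoint folder if it loads this way
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
model.eval()

# Concatenate the wikitext-2 test split and tokenize it once.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
input_ids = tokenizer(text, return_tensors="pt").input_ids

window, nlls, total_preds = 2048, [], 0
for start in range(0, input_ids.size(1) - 1, window):
    chunk = input_ids[:, start : start + window].to(model.device)
    with torch.no_grad():
        # The causal-LM loss is the mean negative log-likelihood over the chunk's predictions.
        loss = model(chunk, labels=chunk).loss
    n_pred = chunk.size(1) - 1
    nlls.append(loss.float() * n_pred)
    total_preds += n_pred

print("wikitext2 perplexity:", torch.exp(torch.stack(nlls).sum() / total_preds).item())
```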