haoyang-amd committed
Commit 243be4a · verified · 1 Parent(s): a5c42b0

Update README.md

Files changed (1):
  1. README.md +11 -9

README.md CHANGED
@@ -1,10 +1,12 @@
---
- base_model: Qwen2-7B
license: other
license_name: llama2
license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2/LICENSE
---
- # Qwen2-7B-Weight-INT4-Per-Group-AWQ-Bfloat16
- ## Introduction
This model was created by applying [Quark](https://quark.docs.amd.com/latest/index.html) with calibration samples from the Pile dataset.
- ## Quantization Strategy
@@ -15,7 +17,7 @@ license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2
1. [Download and install Quark](https://quark.docs.amd.com/latest/install.html)
2. Run the quantization script in the example folder using the following command line:
```sh
- export MODEL_DIR=[local model checkpoint folder] or Qwen/Qwen2-7B
# single GPU
python3 quantize_quark.py --model_dir $MODEL_DIR \
--data_type bfloat16 \
@@ -24,7 +26,7 @@ license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2
--quant_algo awq \
--dataset pileval_for_awq_benchmark \
--seq_len 512 \
- --output_dir Qwen2-7B-W_Int4-Per_Group-AWQ-BFloat16 \
--model_export quark_safetensors
# cpu
python3 quantize_quark.py --model_dir $MODEL_DIR \
@@ -34,7 +36,7 @@ license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2
--quant_algo awq \
--dataset pileval_for_awq_benchmark \
--seq_len 512 \
- --output_dir Qwen2-7B-W_Int4-Per_Group-AWQ-BFloat16 \
--model_export quark_safetensors \
--device cpu
```
@@ -48,17 +50,17 @@ The quantization evaluation results are conducted in pseudo-quantization mode, w
<tr>
<td><strong>Benchmark</strong>
</td>
- <td><strong>Qwen2-7B (Bfloat16)</strong>
</td>
- <td><strong>Qwen2-7B-Weight-INT4-Per-Group-AWQ-Bfloat16 (this model)</strong>
</td>
</tr>
<tr>
<td>Perplexity-wikitext2
</td>
- <td>7.9935
</td>
- <td>8.1179
</td>
</tr>
</table>
 
---
+ base_model:
+ - microsoft/Phi-3-mini-4k-instruct
license: other
license_name: llama2
license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2/LICENSE
---
+
+ # Phi-3-mini-4k-instruct-Weight-INT4-Per-Group-AWQ-Bfloat16
- ## Introduction
This model was created by applying [Quark](https://quark.docs.amd.com/latest/index.html) with calibration samples from the Pile dataset.
- ## Quantization Strategy
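For intuition on the weight-only INT4 per-group scheme named in the title, the sketch below shows the pseudo-quantization (quantize-then-dequantize) view of a single weight matrix. It is an illustration only: the group size of 128 and the asymmetric uint4 mapping are assumptions rather than a statement of this card's exact settings, and the AWQ scaling applied before quantization is not shown.

```python
import torch

def pseudo_quantize_int4_per_group(w: torch.Tensor, group_size: int = 128) -> torch.Tensor:
    """Quantize-then-dequantize a weight matrix to INT4 with one scale/zero-point per group.

    Illustration only: group_size=128 and an asymmetric uint4 mapping are assumptions,
    and the AWQ scaling that precedes this step is omitted.
    """
    out_features, in_features = w.shape
    wg = w.reshape(out_features, in_features // group_size, group_size)
    w_min = wg.amin(dim=-1, keepdim=True)
    w_max = wg.amax(dim=-1, keepdim=True)
    scale = (w_max - w_min).clamp(min=1e-8) / 15.0            # 16 levels for 4 bits (codes 0..15)
    zero = torch.round(-w_min / scale)
    q = torch.clamp(torch.round(wg / scale) + zero, 0, 15)    # integer codes, one scale/zero per group
    return ((q - zero) * scale).reshape(out_features, in_features)  # dequantized weights

# Reconstruction error on a random weight matrix (in_features must be divisible by group_size)
w = torch.randn(1024, 1024)
print("mean abs error:", (w - pseudo_quantize_int4_per_group(w)).abs().mean().item())
```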
 
1. [Download and install Quark](https://quark.docs.amd.com/latest/install.html)
2. Run the quantization script in the example folder using the following command line:
```sh
+ export MODEL_DIR=[local model checkpoint folder] or microsoft/Phi-3-mini-4k-instruct
# single GPU
python3 quantize_quark.py --model_dir $MODEL_DIR \
--data_type bfloat16 \
 
--quant_algo awq \
--dataset pileval_for_awq_benchmark \
--seq_len 512 \
+ --output_dir Phi-3-mini-4k-instruct-W_Int4-Per_Group-AWQ-BFloat16 \
--model_export quark_safetensors
# cpu
python3 quantize_quark.py --model_dir $MODEL_DIR \
 
--quant_algo awq \
--dataset pileval_for_awq_benchmark \
--seq_len 512 \
+ --output_dir Phi-3-mini-4k-instruct-W_Int4-Per_Group-AWQ-BFloat16 \
--model_export quark_safetensors \
--device cpu
```
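The `--quant_algo awq` flag selects activation-aware weight quantization: per-input-channel scales are searched on calibration activations so that the most salient weight channels lose less precision when the scaled weights are quantized. The sketch below is a simplified rendering of that idea and does not reproduce Quark's implementation; the grid search, scale normalization, and error metric are assumptions, and it reuses the `pseudo_quantize_int4_per_group` helper sketched under "Quantization Strategy" above.

```python
import torch

# Assumes pseudo_quantize_int4_per_group from the earlier sketch is in scope.

def awq_scale_search(w: torch.Tensor, x_calib: torch.Tensor,
                     n_grid: int = 20, group_size: int = 128) -> torch.Tensor:
    """Pick a per-input-channel scale s so that quantizing (w * s) while feeding (x / s)
    best preserves the layer output on calibration activations (illustrative only)."""
    act_mag = x_calib.abs().mean(dim=0)                 # per-channel activation magnitude
    ref = x_calib @ w.t()                               # full-precision layer output
    best_err, best_scale = float("inf"), None
    for i in range(n_grid):
        alpha = i / n_grid                              # how strongly to favor salient channels
        s = act_mag.clamp(min=1e-4) ** alpha
        s = s / (s.max() * s.min()).sqrt()              # keep the scales centered around 1
        w_q = pseudo_quantize_int4_per_group(w * s, group_size)
        err = ((x_calib / s) @ w_q.t() - ref).pow(2).mean().item()
        if err < best_err:
            best_err, best_scale = err, s
    return best_scale
```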
 
<tr>
<td><strong>Benchmark</strong>
</td>
+ <td><strong>Phi-3-mini-4k-instruct (Bfloat16)</strong>
</td>
+ <td><strong>Phi-3-mini-4k-instruct-Weight-INT4-Per-Group-AWQ-Bfloat16 (this model)</strong>
</td>
</tr>
<tr>
<td>Perplexity-wikitext2
</td>
+ <td>6.0164
</td>
+ <td>6.5575
</td>
</tr>
</table>
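For reference, the wikitext2 perplexity numbers above are the kind of figure produced by a standard sliding-window evaluation such as the sketch below. This is not the Quark evaluation harness: the window length, stride, and direct `transformers` loading of the bfloat16 baseline are assumptions, and the quantized model would be scored in pseudo-quantization mode rather than loaded this way.

```python
# Illustrative wikitext2 perplexity measurement for the bfloat16 baseline.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # baseline; the quantized checkpoint is evaluated separately
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
model.eval()

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
input_ids = tokenizer(text, return_tensors="pt").input_ids

seq_len, stride = 2048, 2048  # assumed window; the card's evaluation settings may differ
nlls, n_tokens = [], 0
for start in range(0, input_ids.size(1) - 1, stride):
    end = min(start + seq_len, input_ids.size(1))
    ids = input_ids[:, start:end].to(model.device)
    with torch.no_grad():
        # labels == input_ids makes the model return the mean next-token NLL for this window
        loss = model(ids, labels=ids).loss
    nlls.append(loss * (ids.size(1) - 1))
    n_tokens += ids.size(1) - 1
    if end == input_ids.size(1):
        break

print("wikitext2 perplexity:", torch.exp(torch.stack(nlls).sum() / n_tokens).item())
```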