amd/DeepSeek-R1-Distill-Qwen-1.5B-awq-asym-uint4-g128-lmhead-onnx-hybrid

ONNX

English

pooja-ganesh committed 15 days ago

Commit 18e759a (verified) · 1 parent: 7c4f159

Update README.md

README.md +62 -0

@@ -2,5 +2,67 @@
 license: mit
 base_model:
 - deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
+language:
+- en
 ---
+# DeepSeek-R1-Distill-Qwen-1.5B-awq-asym-uint4-g128-lmhead-onnx-hybrid
+- ## Introduction
+  This model was created by quantizing DeepSeek-R1-Distill-Qwen-1.5B with [Quark](https://quark.docs.amd.com/latest/index.html), using calibration samples from the Pile dataset.
+- ## Quantization Strategy
+  - ***Quantized Layers***: All linear layers
+  - ***Weight***: uint4 asymmetric per-group, group_size=128
+- ## Quick Start
+1. [Download and install Quark](https://quark.docs.amd.com/latest/install.html)
+2. Run the quantization script in the example folder using the following command line:
+    ```sh
+    export MODEL_DIR="[local model checkpoint folder]"  # or deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
+    export MODEL_NAME="[prefix for the output folder name]"
+    # single GPU
+    python quantize_quark.py --model_dir $MODEL_DIR \
+                            --output_dir $MODEL_NAME-awq-asym-uint4-g128-lmhead \
+                            --quant_scheme w_uint4_per_group_asym \
+                            --num_calib_data 128 \
+                            --quant_algo awq \
+                            --dataset pileval_for_awq_benchmark \
+                            --seq_len 512 \
+                            --model_export hf_format \
+                            --data_type bfloat16 \
+                            --exclude_layers
+    # cpu
+    python quantize_quark.py --model_dir $MODEL_DIR \
+                            --output_dir $MODEL_NAME-awq-asym-uint4-g128-lmhead \
+                            --quant_scheme w_uint4_per_group_asym \
+                            --num_calib_data 128 \
+                            --quant_algo awq \
+                            --dataset pileval_for_awq_benchmark \
+                            --seq_len 512 \
+                            --model_export hf_format \
+                            --data_type bfloat16 \
+                            --exclude_layers \
+                            --device cpu
+    ```
+#### License
+Modifications Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
+MIT License
+Copyright (c) 2023 DeepSeek
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
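
Note on the quantization strategy in the README above: "uint4 asymmetric per-group, group_size=128" means each run of 128 consecutive input weights gets its own scale and zero point, and every weight is stored as an integer in 0..15. The NumPy sketch below illustrates that numeric format only. It is not Quark's implementation, the function names are hypothetical, and it omits the AWQ scale search that the `--quant_algo awq` option performs before rounding. It also keeps one 4-bit value per `uint8` instead of packing two per byte.

```python
import numpy as np


def quantize_uint4_asym_per_group(w, group_size=128):
    """Quantize a 2-D weight matrix to uint4 codes, one (scale, zero point) per group.

    Hypothetical helper for illustration; not part of Quark.
    """
    out_features, in_features = w.shape
    assert in_features % group_size == 0, "in_features must be a multiple of group_size"
    groups = w.reshape(out_features, in_features // group_size, group_size)

    w_min = groups.min(axis=-1, keepdims=True)
    w_max = groups.max(axis=-1, keepdims=True)

    # uint4 has 16 levels (0..15), so the group range is split into 15 steps.
    scale = (w_max - w_min) / 15.0
    scale = np.where(scale == 0, 1.0, scale)  # guard against constant groups

    # Asymmetric: the zero point shifts the grid so that w_min maps near code 0.
    zero_point = np.clip(np.round(-w_min / scale), 0, 15)

    q = np.clip(np.round(groups / scale) + zero_point, 0, 15).astype(np.uint8)
    return q, scale, zero_point


def dequantize_per_group(q, scale, zero_point, shape):
    """Map uint4 codes back to floating point and restore the original shape."""
    return ((q.astype(np.float32) - zero_point) * scale).reshape(shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((4, 256)).astype(np.float32)  # 2 groups of 128 per row
    q, scale, zp = quantize_uint4_asym_per_group(w, group_size=128)
    w_hat = dequantize_per_group(q, scale, zp, w.shape)
    print("codes in range:", int(q.min()), "to", int(q.max()))
    print("max abs reconstruction error:", float(np.abs(w - w_hat).max()))
```

A smaller group size gives each scale fewer weights to cover, which usually lowers the reconstruction error at the cost of storing more scales and zero points.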