---
license: mit
language:
- en
base_model:
- Qwen/Qwen1.5-7B-Chat
pipeline_tag: text-generation
tags:
- chat
---
# mistralai/Mistral-7B-Instruct-v0.3
## Introduction
- Quantization Tool: Quark 0.6.0
- OGA Model Builder: v0.5.1
## Quantization Strategy
- AWQ / Group 128 / Asymmetric / UINT4 weights / FP16 activations (see the illustrative sketch after the command below)
- Excluded Layers: None
```
python3 quantize_quark.py \
--model_dir "$model" \
--output_dir "$output_dir" \
--quant_scheme w_uint4_per_group_asym \
--num_calib_data 128 \
--quant_algo awq \
--dataset pileval_for_awq_benchmark \
--seq_len 512 \
--model_export quark_safetensors \
--data_type float16 \
--exclude_layers [] \
--custom_mode awq
```
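
For intuition, here is a minimal NumPy sketch of what the `w_uint4_per_group_asym` scheme amounts to for a single weight row: each group of 128 values gets its own scale and zero-point that map the group's range onto [0, 15]. This is illustrative only and not Quark's implementation; AWQ additionally rescales weights using activation statistics before this step, and the function and variable names here are placeholders.

```
import numpy as np

def quantize_uint4_per_group_asym(w_row, group_size=128):
    """Illustrative per-group asymmetric UINT4 quantization of one weight row."""
    qmin, qmax = 0, 15
    groups = w_row.reshape(-1, group_size)
    g_min = groups.min(axis=1, keepdims=True)
    g_max = groups.max(axis=1, keepdims=True)
    # One scale and one zero-point per group of 128 weights.
    scale = np.maximum((g_max - g_min) / (qmax - qmin), 1e-8)
    zero_point = np.clip(np.round(-g_min / scale), qmin, qmax)
    q = np.clip(np.round(groups / scale + zero_point), qmin, qmax).astype(np.uint8)
    # At inference, the UINT4 weights are dequantized and multiplied with FP16 activations.
    dequant = ((q.astype(np.float32) - zero_point) * scale).astype(np.float16)
    return q, scale.astype(np.float16), zero_point.astype(np.uint8), dequant

# Quantize a random 4096-wide row and check the reconstruction error.
row = np.random.randn(4096).astype(np.float32)
q, scale, zp, dq = quantize_uint4_per_group_asym(row)
print("max abs reconstruction error:", np.abs(row - dq.astype(np.float32).reshape(-1)).max())
```

A group size of 128 keeps the per-group scale/zero-point overhead small while limiting the dynamic range each 4-bit code has to cover.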
## OGA Model Builder
```
python builder.py \
-i <quantized safetensor model dir> \
-o <oga model output dir> \
-p int4 \
-e dml
```
- Post-processed to generate the hybrid model
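
Once the hybrid model has been generated, it can be run with the onnxruntime-genai (OGA) runtime matching the model builder version above. The sketch below is an assumption-laden example rather than an official recipe: the model directory and prompt are placeholders, the base model's chat template should be applied to the prompt, and the exact generation API (e.g. `append_tokens` versus assigning `input_ids` on the generator parameters) varies between onnxruntime-genai releases.

```
import onnxruntime_genai as og

# Placeholder path: the directory produced by builder.py / the hybrid post-processing step.
model = og.Model("path/to/oga_model_dir")
tokenizer = og.Tokenizer(model)

# Apply the base model's chat template to the prompt as appropriate.
prompt = "Explain AWQ quantization in one sentence."
input_tokens = tokenizer.encode(prompt)

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)

generator = og.Generator(model, params)
generator.append_tokens(input_tokens)  # newer OGA releases; older ones assign params.input_ids instead
while not generator.is_done():
    generator.generate_next_token()

print(tokenizer.decode(generator.get_sequence(0)))
```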