---
license: llama2
language:
- en
pipeline_tag: text-generation
tags:
- llama
- llama2
- amd
- meta
- facebook
- onnx
base_model:
- meta-llama/Llama-2-7b-hf
---
# meta-llama/Llama-2-7b-hf
- ## Introduction
    - Quantization Tool: Quark 0.6.0
    - OGA Model Builder: v0.5.1
    - Postprocess
- ## Quantization Strategy
    - AWQ / Group 128 / Asymmetric / UINT4 Weights / FP16 activations
    - Excluded Layers: None
```
python3 quantize_quark.py \
    --model_dir "$model" \
    --output_dir "$output_dir" \
    --quant_scheme w_uint4_per_group_asym \
    --num_calib_data 128 \
    --quant_algo awq \
    --dataset pileval_for_awq_benchmark \
    --seq_len 512 \
    --model_export quark_safetensors \
    --data_type float16 \
    --exclude_layers [] \
    --custom_mode awq
```
- ## OGA Model Builder
```
python builder.py \
    -i <quantized safetensors model dir> \
    -o <oga model output dir> \
    -p int4 \
    -e dml
```
- PostProcessed to generate Hybrid Model
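
The postprocessed hybrid model targets the onnxruntime-genai (OGA) runtime. The sketch below shows one way to load and prompt such a model from Python; it is a minimal illustration, assuming the `onnxruntime_genai` package is installed and that `model_dir` points at the generated hybrid model directory. The exact generation-loop calls differ slightly between OGA releases, so treat this as illustrative rather than the exact invocation used for this card.

```python
# Minimal sketch (not part of the original card): run the exported model with
# onnxruntime-genai. The model path and search options are illustrative only.
import onnxruntime_genai as og

model_dir = "path/to/hybrid/model"  # hypothetical: directory produced by the postprocess step

model = og.Model(model_dir)
tokenizer = og.Tokenizer(model)

prompt = "What is AWQ quantization?"
input_tokens = tokenizer.encode(prompt)

params = og.GeneratorParams(model)
params.set_search_options(max_length=128)

generator = og.Generator(model, params)
generator.append_tokens(input_tokens)  # newer OGA releases; older ones set params.input_ids instead

# Generate one token at a time until the stop condition is reached.
while not generator.is_done():
    generator.generate_next_token()

print(tokenizer.decode(generator.get_sequence(0)))
```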