---
license: llama2
language:
- en
pipeline_tag: text-generation
tags:
- llama
- llama2
- amd
- meta
- facebook
- onnx
base_model:
- meta-llama/Llama-2-7b-hf
---
# meta-llama/Llama-2-7b-hf
- ## Introduction
    - Quantization Tool: Quark 0.6.0
    - OGA Model Builder: v0.5.1
    - Postprocess
- ## Quantization Strategy
    - AWQ / Group 128 / Asymmetric / UINT4 Weights / FP16 activations
    - Excluded Layers: None
```
python3 quantize_quark.py \
    --model_dir "$model" \
    --output_dir "$output_dir" \
    --quant_scheme w_uint4_per_group_asym \
    --num_calib_data 128 \
    --quant_algo awq \
    --dataset pileval_for_awq_benchmark \
    --seq_len 512 \
    --model_export quark_safetensors \
    --data_type float16 \
    --exclude_layers [] \
    --custom_mode awq
```
- ## OGA Model Builder
```
python builder.py \
    -i <quantized safetensors model dir> \
    -o <oga model output dir> \
    -p int4 \
    -e dml
```
- PostProcessed to generate Hybrid Model
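
The postprocessed hybrid model targets the onnxruntime-genai (OGA) runtime. The sketch below shows one way to load and prompt such a model from Python; it is a minimal illustration, assuming the `onnxruntime_genai` package is installed and that `model_dir` points at the generated hybrid model directory. The exact generation-loop calls differ slightly between OGA releases, so treat this as illustrative rather than the exact invocation used for this card.

```python
# Minimal sketch (not part of the original card): run the exported model with
# onnxruntime-genai. The model path and search options are illustrative only.
import onnxruntime_genai as og

model_dir = "path/to/hybrid/model"  # hypothetical: directory produced by the postprocess step

model = og.Model(model_dir)
tokenizer = og.Tokenizer(model)

prompt = "What is AWQ quantization?"
input_tokens = tokenizer.encode(prompt)

params = og.GeneratorParams(model)
params.set_search_options(max_length=128)

generator = og.Generator(model, params)
generator.append_tokens(input_tokens)  # newer OGA releases; older ones set params.input_ids instead

# Generate one token at a time until the stop condition is reached.
while not generator.is_done():
    generator.generate_next_token()

print(tokenizer.decode(generator.get_sequence(0)))
```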