Meta-Llama-3.1-8B-Instruct-AWQ
This model was produced with the following command:
python quantize.py --model_dir /Meta-Llama-3.1-8B-Instruct \
    --output_dir /Meta-Llama-3.1-8B-Instruct-AWQ \
    --dtype bfloat16 \
    --qformat int4_awq \
    --awq_block_size 64
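
To give a rough idea of what `--qformat int4_awq` combined with `--awq_block_size 64` implies for the stored weights, here is a minimal pure-Python sketch of block-wise 4-bit quantization. It is an illustration under stated assumptions, not the code inside `quantize.py`: it assumes symmetric quantization to the range [-7, 7] with one scale per block of 64 consecutive weights, and it leaves out the activation-aware scaling step that distinguishes AWQ from plain round-to-nearest. The function names `quantize_blockwise` and `dequantize_blockwise` are hypothetical.

```python
import random

BLOCK_SIZE = 64  # mirrors --awq_block_size 64
QMAX = 7         # assumed symmetric signed 4-bit range: [-7, 7]


def quantize_blockwise(weights, block_size=BLOCK_SIZE):
    """Quantize a flat list of floats to 4-bit codes, one scale per block.

    Hypothetical helper for illustration; the real tool may use a
    different scheme (for example zero points or a different range).
    """
    if len(weights) % block_size != 0:
        raise ValueError("length must be a multiple of block_size")
    codes, scales = [], []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        max_abs = max(abs(w) for w in block)
        # Avoid division by zero for an all-zero block.
        scale = max_abs / QMAX if max_abs > 0 else 1.0
        scales.append(scale)
        # Round to the nearest code and clamp into the 4-bit range.
        codes.extend(max(-QMAX, min(QMAX, round(w / scale))) for w in block)
    return codes, scales


def dequantize_blockwise(codes, scales, block_size=BLOCK_SIZE):
    """Recover approximate floats by multiplying each code by its block scale."""
    return [c * scales[i // block_size] for i, c in enumerate(codes)]


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [rng.gauss(0.0, 0.02) for _ in range(4 * BLOCK_SIZE)]
    codes, scales = quantize_blockwise(weights)
    restored = dequantize_blockwise(codes, scales)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"{len(scales)} blocks, worst absolute error {worst:.6f}")
```

Under these assumptions, a smaller block size means more scales to store but a tighter fit per block, so the reconstruction error of any weight is bounded by half of its own block's scale.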