4-bit (32 groupsize) quantized files for WizardLM/WizardLM-13B-V1.1

Quantized using GPTQ-for-LLaMa.

Command used to quantize:

```shell
python llama.py /my/model/directory c4 --wbits 4 --true-sequential --act-order --groupsize 32 --save_safetensors /my/output/file.safetensors
```
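To use the resulting `.safetensors` file, the quantized weights can be loaded with a GPTQ-compatible runtime. A minimal sketch using the AutoGPTQ library is shown below; the library choice, the local model directory, and the generation parameters are assumptions, not part of this card:

```python
# Hypothetical loading sketch using AutoGPTQ; the directory path and
# generation settings below are illustrative assumptions.
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

# Directory containing the quantized .safetensors file plus the model config
model_dir = "/my/output"

tokenizer = AutoTokenizer.from_pretrained("WizardLM/WizardLM-13B-V1.1")
model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    use_safetensors=True,  # weights were saved with --save_safetensors
    device="cuda:0",
)

prompt = "Explain quantization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Because the weights were quantized with `--groupsize 32` and `--act-order`, the loader must support those GPTQ options.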
