4-bit (32 groupsize) quantized files for WizardLM/WizardLM-13B-V1.1
Quantized using GPTQ-for-LLaMa. Command used to quantize:

```shell
python llama.py /my/model/directory c4 --wbits 4 --true-sequential --act-order --groupsize 32 --save_safetensors /my/output/file.safetensors
```
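These weights can be loaded with any GPTQ-capable runtime. As one illustration, here is a minimal sketch using the AutoGPTQ library; the local directory path, device string, and prompt are placeholders (not part of this repo), and the quantization parameters simply mirror the command above:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

# Placeholder: directory containing the quantized .safetensors file.
model_dir = "/my/output"

# Mirror the quantization settings used above.
quantize_config = BaseQuantizeConfig(
    bits=4,          # --wbits 4
    group_size=32,   # --groupsize 32
    desc_act=True,   # --act-order
)

tokenizer = AutoTokenizer.from_pretrained("WizardLM/WizardLM-13B-V1.1")
model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    quantize_config=quantize_config,
    use_safetensors=True,
    device="cuda:0",
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

This is a sketch under the assumption that the quantized file sits in a local directory alongside the base model's config; adjust paths and device to your setup.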