Based on https://huggingface.co/Fsoft-AIC/CodeCapybara

Quantized with https://github.com/qwopqwop200/GPTQ-for-LLaMa (triton branch):

```
python llama.py CodeCapybara/ c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save_safetensors codecapybara-4bit-128g-gptq.safetensors
```
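Once quantized, the checkpoint can be run with the inference script shipped in the same GPTQ-for-LLaMa repository. A minimal sketch, assuming the repo's `llama_inference.py` script and that the base model directory and the prompt text are placeholders you substitute yourself:

```shell
# Hypothetical invocation: load the 4-bit, groupsize-128 checkpoint produced above
# and generate from a prompt. Flags mirror the quantization settings used.
python llama_inference.py CodeCapybara/ \
  --wbits 4 \
  --groupsize 128 \
  --load codecapybara-4bit-128g-gptq.safetensors \
  --text "Write a Python function that reverses a string."
```

The `--wbits` and `--groupsize` values must match those used at quantization time, or the packed weights will be unpacked incorrectly.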