This model is a GPTQ 4-bit quantized version of meta-llama/Llama-2-7b-hf.
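A minimal sketch of loading a GPTQ-quantized checkpoint with `transformers` (which delegates GPTQ dequantization to `optimum`/`auto-gptq` when installed). The repository id below is a placeholder, since the actual repo id is not stated in this card; substitute the real one.

```python
# Sketch only: assumes `pip install transformers optimum auto-gptq` and a CUDA GPU.
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- replace with this model's actual Hub id.
repo_id = "your-namespace/Llama-2-7b-hf-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# device_map="auto" places the quantized weights on the available GPU(s).
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the weights are stored in 4-bit GPTQ format, this loads in roughly a quarter of the memory of the fp16 base model, at some cost in generation quality.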
