Gemma-2-9B-it-4Bit-GPTQ

Quantization

  • This model was quantized with the AutoGPTQ library on a calibration dataset of English and Russian Wikipedia articles. It achieves lower perplexity on Russian data than other GPTQ quantizations of Gemma-2-9B-it (see the table and the sketches below).
| Model | Bits | Perplexity (Russian Wikipedia) |
|---|---|---|
| gemma-2-9b-it | 16 | 6.2152 |
| Granther/Gemma-2-9B-Instruct-4Bit-GPTQ | 4 | 6.4966 |
| this model | 4 | 6.3593 |
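
As a rough illustration of the procedure described above, here is a minimal AutoGPTQ quantization sketch. The calibration slice sizes, Wikipedia builds, base checkpoint (`google/gemma-2-9b-it`), and `BaseQuantizeConfig` settings are assumptions; the card does not state the exact recipe.

```python
# Quantization sketch with AutoGPTQ. Calibration sizes, Wikipedia builds, and
# quantization settings below are assumptions, not the card's exact recipe.
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from datasets import load_dataset
from transformers import AutoTokenizer

base_id = "google/gemma-2-9b-it"
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Mixed English/Russian Wikipedia calibration set, as described above.
en = load_dataset("wikimedia/wikipedia", "20231101.en", split="train[:256]")
ru = load_dataset("wikimedia/wikipedia", "20231101.ru", split="train[:256]")
examples = [
    tokenizer(text, truncation=True, max_length=2048, return_tensors="pt")
    for text in en["text"] + ru["text"]
]

quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)
model = AutoGPTQForCausalLM.from_pretrained(base_id, quantize_config)
model.quantize(
    [{"input_ids": ex.input_ids, "attention_mask": ex.attention_mask} for ex in examples]
)
model.save_quantized("gemma-2-9B-it-4Bit-GPTQ")
```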
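
The table's figures were presumably obtained with a sliding-window perplexity evaluation over Russian Wikipedia; the card does not give the protocol, so the dataset slice, context length, and stride below are assumptions. Loading goes through Hugging Face Transformers, which dispatches GPTQ checkpoints via Optimum and AutoGPTQ.

```python
# Sliding-window perplexity sketch. Dataset slice, context length, and stride
# are assumptions -- the card does not state its evaluation protocol.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qilowoq/gemma-2-9B-it-4Bit-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

wiki = load_dataset("wikimedia/wikipedia", "20231101.ru", split="train[:500]")
encodings = tokenizer("\n\n".join(wiki["text"]), return_tensors="pt")
seq_len = encodings.input_ids.size(1)

max_length, stride = 2048, 512
nlls, prev_end = [], 0
for begin in range(0, seq_len, stride):
    end = min(begin + max_length, seq_len)
    trg_len = end - prev_end  # number of new tokens scored in this window
    input_ids = encodings.input_ids[:, begin:end].to(model.device)
    target_ids = input_ids.clone()
    target_ids[:, :-trg_len] = -100  # mask the overlapping prefix
    with torch.no_grad():
        nlls.append(model(input_ids, labels=target_ids).loss * trg_len)
    prev_end = end
    if end == seq_len:
        break

print(f"perplexity: {torch.exp(torch.stack(nlls).sum() / prev_end).item():.4f}")
```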

Base model: google/gemma-2-9b