Gemma 3

Collection of quants for Google's Gemma 3
This repository contains GGUF quantized versions of Google's Gemma 3 4B instruction-tuned model, optimized for efficient deployment across various hardware configurations.
| Model | Size | Compression Ratio | Size Reduction |
|---|---|---|---|
| Q8_0 | 4.1 GB | 53% | 47% |
| Q6_K | 3.2 GB | 41% | 59% |
| Q4_K | 2.5 GB | 32% | 68% |
| Q2_K | 1.7 GB | 22% | 78% |
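The sizes in the table follow roughly from the bits-per-weight of each quantization type. The sketch below estimates file size as `parameters × bits-per-weight / 8`; the bits-per-weight values are approximations for llama.cpp's quant formats (an assumption, not figures from this repository — real GGUF files mix quant types across tensors and add metadata, so actual sizes differ somewhat):

```python
# Approximate bits-per-weight for common llama.cpp quant types (assumed values).
BITS_PER_WEIGHT = {"F16": 16.0, "Q8_0": 8.5, "Q6_K": 6.6, "Q4_K": 4.8, "Q2_K": 2.6}

def estimate_size_gb(n_params: float, quant: str) -> float:
    """Rough on-disk size in GB for n_params weights at the given quant type."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9

# For a ~4-billion-parameter model:
for q in ("Q8_0", "Q6_K", "Q4_K", "Q2_K"):
    print(q, round(estimate_size_gb(4e9, q), 1))
```

This gives ballpark figures only (e.g. ~4.3 GB for Q8_0 at 4B parameters); embedding and output tensors are often kept at higher precision, which is one reason the measured sizes above run slightly larger for the smallest quants.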
These models can be used with llama.cpp and its various interfaces. For example, with the `llama-gemma3-cli` binary:

```sh
# Running with llama-gemma3-cli (adjust paths and binary name for your platform)
./llama-gemma3-cli --model gemma-3-4b-it-q4k.gguf --ctx-size 4096 --temp 0.7 --prompt "Write a short story about a robot who discovers it has feelings."
```
This model is released under the same Gemma license as the original model.
This quantized set is derived from Google's Gemma 3 4B instruction-tuned model.
```bibtex
@article{gemma_2025,
  title={Gemma 3},
  url={https://goo.gle/Gemma3Report},
  publisher={Kaggle},
  author={Gemma Team},
  year={2025}
}

@misc{gemma3_quantization_2025,
  title={Quantized Versions of Google's Gemma 3 4B Model},
  author={Lex-au},
  year={2025},
  month={March},
  note={Quantized models (Q8_0, Q6_K, Q4_K, Q2_K) derived from Google's Gemma 3 4B},
  url={https://huggingface.co/lex-au}
}
```