Gemma 3
This repository contains GGUF quantized versions of Google's Gemma 3 1B instruction-tuned model, optimized for efficient deployment across various hardware configurations.
| Model | Size | Compression Ratio | Size Reduction |
|---|---|---|---|
| Q8_0 | 1.07 GB | 54% | 46% |
| Q6_K | 1.01 GB | 51% | 49% |
| Q4_K | 0.81 GB | 40% | 60% |
| Q2_K | 0.69 GB | 34% | 66% |
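The two percentage columns are complementary: the compression ratio is the quantized file size as a fraction of the original checkpoint, and the size reduction is what remains. A back-of-envelope check, assuming an unquantized checkpoint of roughly 2.0 GB (an estimate inferred from the table, not a figure stated in this card):

```python
# Sketch: how the table's percentages relate to file sizes.
# ORIGINAL_GB (~2.0 GB for the unquantized 1B model) is an assumption
# inferred from the table, not a number given in the model card.
ORIGINAL_GB = 2.0

def compression_ratio(quant_gb: float, original_gb: float = ORIGINAL_GB) -> float:
    """Quantized size as a percentage of the original size."""
    return 100.0 * quant_gb / original_gb

def size_reduction(quant_gb: float, original_gb: float = ORIGINAL_GB) -> float:
    """Percentage saved relative to the original size."""
    return 100.0 - compression_ratio(quant_gb, original_gb)

for name, gb in [("Q8_0", 1.07), ("Q6_K", 1.01), ("Q4_K", 0.81), ("Q2_K", 0.69)]:
    print(f"{name}: {compression_ratio(gb):.1f}% of original, {size_reduction(gb):.1f}% saved")
```

Under that assumed baseline the computed values land within about half a percentage point of the table, consistent with the table's figures being rounded.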
These models can be used with llama.cpp and its various interfaces. Example:

```shell
# Running with llama-gemma3-cli (adjust paths as needed)
./llama-gemma3-cli --model Google.Gemma-3-1b-it-Q4_K.gguf --ctx-size 4096 --temp 0.7 --prompt "Write a short story about a robot who discovers it has feelings."
```
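A common way to choose among the quants above is by memory budget: the GGUF file must fit in RAM or VRAM with headroom left for the KV cache and runtime overhead. A minimal sketch, using the file sizes from the table; the fixed 0.5 GB headroom is an illustrative assumption, not a measured figure:

```python
# Pick the highest-fidelity quant that fits a given memory budget.
# Sizes (GB) are the file sizes from the table above; HEADROOM_GB is an
# illustrative allowance for KV cache and runtime overhead, not measured.
QUANTS = [("Q8_0", 1.07), ("Q6_K", 1.01), ("Q4_K", 0.81), ("Q2_K", 0.69)]
HEADROOM_GB = 0.5

def pick_quant(budget_gb: float):
    """Return the largest quant that fits the budget, or None if none fit."""
    for name, size_gb in QUANTS:  # ordered from largest to smallest
        if size_gb + HEADROOM_GB <= budget_gb:
            return name
    return None
```

For example, `pick_quant(1.5)` selects Q4_K under these assumptions, since Q8_0 and Q6_K would exceed the budget once headroom is included.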
This quantized set is derived from Google's Gemma 3 1B instruction-tuned model and is released under the same Gemma license as the original model.
```bibtex
@article{gemma_2025,
  title={Gemma 3},
  url={https://goo.gle/Gemma3Report},
  publisher={Kaggle},
  author={Gemma Team},
  year={2025}
}

@misc{gemma3_quantization_2025,
  title={Quantized Versions of Google's Gemma 3 1B Model},
  author={Lex-au},
  year={2025},
  month={March},
  note={Quantized models (Q8_0, Q6_K, Q4_K, Q2_K) derived from Google's Gemma 3 1B},
  url={https://huggingface.co/lex-au}
}
```