---
license: other
license_name: gemma-terms-of-use
license_link: https://ai.google.dev/gemma/terms
tags:
- gemma
- gguf
---

# Gemma 2B Instruct GGUF

This repository contains Q4 and Q8 quantized GGUF files for [google/gemma](https://huggingface.co/collections/google/gemma-release-65d5efbccdbb8c4202ec078b).

## Performance

| Variant | Device | Throughput |
| --- | --- | --- |
| Q4 | M1 Pro 10-core GPU | 90 tok/s |
| | Snapdragon 778G CPU | 10 tok/s |
| | RTX 2070S | 40 tok/s |
| Q8 | M1 Pro 10-core GPU | 54 tok/s |
| | Snapdragon 778G CPU | 6 tok/s |
| | RTX 2070S | 25 tok/s |
| F16 | M1 Pro 10-core GPU | 30 tok/s |
| | Snapdragon 778G CPU | <1 tok/s |
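
The table does not say how these figures were measured. As a rough sketch, a tok/s number can be obtained by counting the tokens a runtime streams back and dividing by wall-clock time. Everything below is an assumption for illustration: `tokens_per_second` and `fake_generate` are hypothetical names, and `fake_generate` only simulates a model so that the snippet runs without any GGUF file or inference library.

```python
import time
from typing import Callable, Iterable


def tokens_per_second(generate: Callable[[], Iterable[object]]) -> float:
    """Consume generate() once and return items yielded per wall-clock second."""
    start = time.perf_counter()
    count = sum(1 for _ in generate())  # one yielded item is counted as one token
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        raise ValueError("elapsed time too small to measure")
    return count / elapsed


def fake_generate(n_tokens: int = 50, delay_s: float = 0.001):
    """Hypothetical stand-in for a model's streaming output (not a real model)."""
    for i in range(n_tokens):
        time.sleep(delay_s)  # simulate per-token decode latency
        yield i


if __name__ == "__main__":
    rate = tokens_per_second(lambda: fake_generate(50, 0.001))
    print(f"{rate:.1f} tok/s")
```

To time a real model, `fake_generate` would be replaced by the streaming call of whichever GGUF runtime is in use. Note that this measures the whole call, so prompt processing time is folded into the result unless it is timed separately.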