What's going on with the PPL on the BF16 gemma 4 e2b?

by an0nya - opened 5 days ago

| Model | Size | BPW | PPL | Speed |
|---|---|---|---|---|
| BF16 (original) | 8.67 GB | 16.00 | 154.0 | 4.2 t/s |
| ggml Q2_K + iMatrix | 2.77 GB | 5.12 | 89.1 | 14.0 t/s |
| HPC Q2_K + Q4_0·Shor | 1.44 GB | ~3.0 | 129.6 | 18.1 t/s |

CompressedGemma

Owner 5 days ago

It's a measurement artifact from a previous version that started from a safetensors base. I was lazy about updating the README because I didn't expect anybody to find this utility, let alone show any interest in using it.
