Update README.md
README.md
@@ -255,7 +255,7 @@ lm_eval --model hf --model_args pretrained=$MODEL --tasks mmlu --device cuda:0 -
 | Benchmark | | | |
 |----------------------------------|------------------------|--------------------------------|---------------------------------|
 | | google/gemma-3-12b-it | jerryzh168/gemma-3-12b-it-INT4 | pytorch/gemma-3-12b-it-AWQ-INT4 |
-| Peak Memory (GB) | 24.50 | 8.57 (65% reduction) | 12.
+| Peak Memory (GB) | 24.50 | 8.57 (65% reduction) | 12.60 (49% reduction) |
 
 Note: jerryzh168/gemma-3-12b-it-INT4 is the H100 optimized checkpoint for INT4
 
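For context, a minimal sketch of how peak-memory figures like the "Peak Memory (GB)" row could be reproduced. The README does not show the measurement code; the helper name, workload, and loading snippet below are assumptions, using only standard `torch.cuda` memory APIs.

```python
import torch

# Hypothetical helper (not from the README): runs any inference workload and
# reports the peak CUDA memory allocated during it, in GB.
def measure_peak_memory_gb(run_workload, device="cuda:0"):
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats(device)
    run_workload()
    torch.cuda.synchronize(device)
    return torch.cuda.max_memory_allocated(device) / 1024**3

# Example usage (assumes the checkpoint loads via transformers on a single GPU):
#   model = AutoModelForCausalLM.from_pretrained("pytorch/gemma-3-12b-it-AWQ-INT4", device_map="cuda:0")
#   peak = measure_peak_memory_gb(lambda: model.generate(**inputs, max_new_tokens=32))
```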