RonanMcGovern committed on
Commit
e01e96e
1 Parent(s): 379b321

add quantization benchmarking

  1. README.md +19 -0
README.md CHANGED
@@ -20,6 +20,25 @@ vLLM compatible model that will run in:
  [One click Runpod template](https://runpod.io/console/deploy?template=rzgcdh9rqe&ref=jmfkcdio) (affiliate link).
  Other templates available from [Trelis' one-click-llms repo](https://github.com/TrelisResearch/one-click-llms).

+ ## Quantization Benchmarking
+ Run using lm_eval (EleutherAI's lm-evaluation-harness) on 100 rows of gsm8k.
+
+ Base model - 16 bit:
+ ```
+ |Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
+ |-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
+ |gsm8k|      3|flexible-extract|     5|exact_match|↑  | 0.93|±  |0.0256|
+ |     |       |strict-match    |     5|exact_match|↑  | 0.93|±  |0.0256|
+ ```
+
+ fp8 model:
+ ```
+ |Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
+ |-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
+ |gsm8k|      3|flexible-extract|     5|exact_match|↑  | 0.93|±  |0.0256|
+ |     |       |strict-match    |     5|exact_match|↑  | 0.93|±  |0.0256|
+ ```
+
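The `Stderr` column can be sanity-checked by hand. The sketch below assumes each of the 100 rows is scored 0 or 1 and that the standard error is taken from the sample (n - 1) variance of those scores; that assumption is not stated in the README, but it does reproduce the reported 0.0256. The function name `binomial_stderr` is made up for this illustration.

```python
import math


def binomial_stderr(accuracy: float, n: int) -> float:
    """Standard error of the mean of n 0/1 scores.

    Assumption: the sample variance uses the (n - 1) denominator.
    """
    return math.sqrt(accuracy * (1.0 - accuracy) / (n - 1))


n_rows = 100   # "100 rows of gsm8k" from the README
acc = 0.93     # exact_match reported for both the 16 bit and fp8 models

stderr = binomial_stderr(acc, n_rows)
print(f"stderr = {stderr:.4f}")  # stderr = 0.0256

# Rough 95% interval under a normal approximation (an illustration only).
low = acc - 1.96 * stderr
high = acc + 1.96 * stderr
print(f"approx. 95% interval: {low:.3f} to {high:.3f}")
```

With only 100 rows the interval spans roughly 0.88 to 0.98, so identical scores for the 16 bit and fp8 models indicate no measurable degradation at this sample size rather than proving the two are equivalent.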
  ---
  # Phi-4