JaaackXD committed on
Commit a0ff966
1 Parent(s): dad05ff

Update README.md

Files changed (1):
  1. README.md +31 -0
README.md CHANGED
@@ -14,6 +14,37 @@ Including the original LLaMA 3 models file cloning from the Meta HF repo. (https
 
 
 If you have issues downloading the models from Meta or converting models for `llama.cpp`, feel free to download this one!
 
+ ## Perplexity table on LLaMA 3 70B
+
+ Lower perplexity is better. (Credit: [dranger003](https://github.com/ggerganov/llama.cpp/pull/6745#issuecomment-2093892514))
+
+ | Quantization | Size (GiB) | Perplexity (wiki.test) | Delta (FP16) |
+ |--------------|------------|------------------------|--------------|
+ | IQ1_S | 14.29 | 9.8655 +/- 0.0625 | 248.51% |
+ | IQ1_M | 15.60 | 8.5193 +/- 0.0530 | 201.94% |
+ | IQ2_XXS | 17.79 | 6.6705 +/- 0.0405 | 135.64% |
+ | IQ2_XS | 19.69 | 5.7486 +/- 0.0345 | 103.07% |
+ | IQ2_S | 20.71 | 5.5215 +/- 0.0318 | 95.05% |
+ | Q2_K_S | 22.79 | 5.4334 +/- 0.0325 | 91.94% |
+ | IQ2_M | 22.46 | 4.8959 +/- 0.0276 | 72.35% |
+ | Q2_K | 24.56 | 4.7763 +/- 0.0274 | 68.73% |
+ | IQ3_XXS | 25.58 | 3.9671 +/- 0.0211 | 40.14% |
+ | IQ3_XS | 27.29 | 3.7210 +/- 0.0191 | 31.45% |
+ | Q3_K_S | 28.79 | 3.6502 +/- 0.0192 | 28.95% |
+ | IQ3_S | 28.79 | 3.4698 +/- 0.0174 | 22.57% |
+ | IQ3_M | 29.74 | 3.4402 +/- 0.0171 | 21.53% |
+ | Q3_K_M | 31.91 | 3.3617 +/- 0.0172 | 18.75% |
+ | Q3_K_L | 34.59 | 3.3016 +/- 0.0168 | 16.63% |
+ | IQ4_XS | 35.30 | 3.0310 +/- 0.0149 | 7.07% |
+ | IQ4_NL | 37.30 | 3.0261 +/- 0.0149 | 6.90% |
+ | Q4_K_S | 37.58 | 3.0050 +/- 0.0148 | 6.15% |
+ | Q4_K_M | 39.60 | 2.9674 +/- 0.0146 | 4.83% |
+ | Q5_K_S | 45.32 | 2.8843 +/- 0.0141 | 1.89% |
+ | Q5_K_M | 46.52 | 2.8656 +/- 0.0139 | 1.23% |
+ | Q6_K | 53.91 | 2.8441 +/- 0.0138 | 0.47% |
+ | Q8_0 | 69.83 | 2.8316 +/- 0.0138 | 0.03% |
+ | F16 | 131.43 | 2.8308 +/- 0.0138 | 0.00% |
+
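The Delta (FP16) column appears to be the relative perplexity increase over the F16 baseline (2.8308). A minimal sketch of that arithmetic, assuming this definition (the helper name is illustrative, not part of the original card):

```python
# Relative perplexity increase vs. the F16 baseline, as a percentage.
# Baseline value is the F16 row of the table above (wiki.test perplexity).
PPL_F16 = 2.8308

def delta_vs_f16(ppl: float) -> float:
    """Percent increase in perplexity relative to the F16 baseline."""
    return (ppl - PPL_F16) / PPL_F16 * 100

print(f"IQ1_S: {delta_vs_f16(9.8655):.2f}%")  # table reports 248.51%
print(f"Q8_0:  {delta_vs_f16(2.8316):.2f}%")  # table reports 0.03%
```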
 **Where to send questions or comments about the model:** Instructions on how to provide feedback or comments on the model can be found in the model [README](https://github.com/meta-llama/llama3). For more technical information about generation parameters and recipes for how to use Llama 3 in applications, please go [here](https://github.com/meta-llama/llama-recipes).
 
 ## License