JustinLin610 committed
Commit e2ff9a3
1 Parent(s): 07800fc

Update README.md

Files changed (1)
  1. README.md  +12 -1
README.md CHANGED
@@ -22,7 +22,18 @@ Qwen1.5 is the beta version of Qwen2, a transformer-based decoder-only language
  * Stable support of 32K context length for models of all sizes
  * No need of `trust_remote_code`.

- For more details, please refer to our [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). In this repo, we provide the `q8_0` quantized model in the GGUF format.
+ For more details, please refer to our [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5).
+ In this repo, we provide quantized models in the GGUF format, including `q2_k`, `q3_k_m`, `q4_0`, `q4_k_m`, `q5_0`, `q5_k_m`, `q6_k` and `q8_0`.
+ To demonstrate their quality, we follow [`llama.cpp`](https://github.com/ggerganov/llama.cpp) to evaluate their perplexity on the wiki test set. Results are shown below:
+
+ | Size | fp16  | q8_0  | q6_k  | q5_k_m | q5_0  | q4_k_m | q4_0  | q3_k_m | q2_k  |
+ |------|-------|-------|-------|--------|-------|--------|-------|--------|-------|
+ | 0.5B | 34.20 | 34.22 | 34.31 | 33.80  | 34.02 | 34.27  | 36.74 | 38.25  | 62.14 |
+ | 1.8B | 15.99 | 15.99 | 15.99 | 16.09  | 16.01 | 16.22  | 16.54 | 17.03  | 19.99 |
+ | 4B   | 13.20 | 13.21 | 13.28 | 13.24  | 13.27 | 13.61  | 13.44 | 13.67  | 15.65 |
+ | 7B   | 14.21 | 14.24 | 14.35 | 14.32  | 14.12 | 14.35  | 14.47 | 15.11  | 16.57 |
+ | 14B  | 10.91 | 10.91 | 10.93 | 10.88  | 10.88 | 10.92  | 10.92 | 11.24  | 12.27 |
+ | 72B  | 7.97  | 7.99  | 7.99  | 7.99   | 8.01  | 8.00   | 8.01  | 8.06   | 8.63  |
  <br>

  ## Model Details
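
The perplexity figures added above follow `llama.cpp`'s evaluation tooling. As a rough, non-authoritative sketch of how such a measurement might be reproduced, the snippet below shells out to `llama.cpp`'s `perplexity` example binary over a WikiText-2 test file and parses the reported PPL; the binary path `./perplexity`, the file name `wiki.test.raw`, and the GGUF file name are assumptions about a local setup, not part of this commit.

```python
# Minimal sketch (assumed paths): run llama.cpp's perplexity tool on the
# WikiText-2 test file and extract the final PPL value from its output.
import re
import subprocess

def wiki_perplexity(model_path: str,
                    test_file: str = "wiki.test.raw",
                    perplexity_bin: str = "./perplexity") -> float:
    """Return the perplexity reported by llama.cpp's `perplexity` binary."""
    result = subprocess.run(
        [perplexity_bin, "-m", model_path, "-f", test_file],
        capture_output=True, text=True, check=True,
    )
    # The tool ends with a summary line such as "Final estimate: PPL = 34.27 +/- ...".
    match = re.search(r"PPL\s*=\s*([0-9.]+)", result.stdout + result.stderr)
    if match is None:
        raise RuntimeError("no PPL value found in perplexity output")
    return float(match.group(1))

if __name__ == "__main__":
    # Hypothetical GGUF file name; substitute the file actually downloaded from this repo.
    print(wiki_perplexity("qwen1_5-7b-chat-q8_0.gguf"))
```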