JustinLin610 committed • Commit e2ff9a3 • Parent(s): 07800fc
Update README.md

README.md CHANGED
@@ -22,7 +22,18 @@ Qwen1.5 is the beta version of Qwen2, a transformer-based decoder-only language
* Stable support of 32K context length for models of all sizes
* No need of `trust_remote_code`.

For more details, please refer to our [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5).

In this repo, we provide quantized models in the GGUF format, including `q2_k`, `q3_k_m`, `q4_0`, `q4_k_m`, `q5_0`, `q5_k_m`, `q6_k` and `q8_0`.

To demonstrate their model quality, we follow [`llama.cpp`](https://github.com/ggerganov/llama.cpp) to evaluate their perplexity on the wiki test set. Results are shown below:
| Size | fp16  | q8_0  | q6_k  | q5_k_m | q5_0  | q4_k_m | q4_0  | q3_k_m | q2_k  |
|------|-------|-------|-------|--------|-------|--------|-------|--------|-------|
| 0.5B | 34.20 | 34.22 | 34.31 | 33.80  | 34.02 | 34.27  | 36.74 | 38.25  | 62.14 |
| 1.8B | 15.99 | 15.99 | 15.99 | 16.09  | 16.01 | 16.22  | 16.54 | 17.03  | 19.99 |
| 4B   | 13.20 | 13.21 | 13.28 | 13.24  | 13.27 | 13.61  | 13.44 | 13.67  | 15.65 |
| 7B   | 14.21 | 14.24 | 14.35 | 14.32  | 14.12 | 14.35  | 14.47 | 15.11  | 16.57 |
| 14B  | 10.91 | 10.91 | 10.93 | 10.88  | 10.88 | 10.92  | 10.92 | 11.24  | 12.27 |
| 72B  | 7.97  | 7.99  | 7.99  | 7.99   | 8.01  | 8.00   | 8.01  | 8.06   | 8.63  |
<br>

## Model Details