InferenceIllusionist committed 3e7d289 (parent: 85c2b50)

Update README.md

README.md CHANGED
@@ -28,7 +28,7 @@ PROUDLY PRESENTS
 
 
 Quantized from fp16 with love.
-* Weighted
+* Weighted quantizations were calculated using groups_merged.txt with 105 chunks (recommended amount for this file) and n_ctx=512. Special thanks to jukofyork for sharing [this process](https://huggingface.co/jukofyork/WizardLM-2-8x22B-imatrix)
 
 For a brief rundown of iMatrix quant performance please see this [PR](https://github.com/ggerganov/llama.cpp/pull/5747)
 
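The weighted-quantization step described in the added line can be sketched with llama.cpp's imatrix and quantize tools. This is a hedged outline, not the author's exact invocation: the model and output filenames are placeholders, and IQ4_XS is chosen purely as an example quant type; only the calibration file, chunk count, and context size come from the commit.

```shell
# Sketch of the importance-matrix workflow the README describes.
# Filenames below are illustrative placeholders, not from the commit.

# 1. Compute the importance matrix over the calibration text
#    (groups_merged.txt, 105 chunks, n_ctx=512, per the README):
./imatrix -m model-fp16.gguf -f groups_merged.txt --chunks 105 -c 512 -o imatrix.dat

# 2. Produce a weighted quantization guided by that matrix
#    (IQ4_XS is only an example target type):
./quantize --imatrix imatrix.dat model-fp16.gguf model-IQ4_XS.gguf IQ4_XS
```

The imatrix step weights each tensor's quantization error by how much its activations matter on the calibration text, which is what the linked llama.cpp PR benchmarks.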