Update README.md
README.md
CHANGED
@@ -19,6 +19,11 @@ license_link: LICENSE
 
 # Quant Infos
 
+- quants done with an importance matrix for improved quantization loss
+- K & IQ quants in basically all variants
+- fixed end token for instruct mode (<|eot_id|>[128009])
+- files larger than 50GB were split using the gguf-split utility, just download all parts and point llama.cpp to the first one (00001-of-x)
+
 Quantized with [llama.cpp](https://github.com/ggerganov/llama.cpp) commit with tokenizer fixes from [this](https://github.com/ggerganov/llama.cpp/pull/6745) branch cherry-picked [0d56246f4b9764158525d894b96606f6163c53a8](https://github.com/ggerganov/llama.cpp/commit/0d56246f4b9764158525d894b96606f6163c53a8) (master from 2024-04-18)
 
 Imatrix dataset was used from [here](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384)
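
The added note about split files says to download all parts and point llama.cpp to the first one (00001-of-x). A minimal sketch of that check is shown here, assuming the parts follow a `<prefix>-00001-of-0000N.gguf` naming pattern; the pattern, the helper name `first_shard`, and the example filenames are illustrative assumptions, not taken from this commit.

```python
import re

# Assumed shard naming: <prefix>-00001-of-00003.gguf (an assumption, adjust if your files differ).
SHARD_RE = re.compile(r"^(?P<prefix>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.gguf$")


def first_shard(filenames):
    """Return the first part's filename once every part is confirmed present.

    Hypothetical helper: raises ValueError if no parts are found, if the
    parts disagree on the total count, or if any part is missing.
    """
    parts = {}      # part index -> filename
    totals = set()  # every "of-N" value seen
    for name in filenames:
        m = SHARD_RE.match(name)
        if m is None:
            continue  # not a split part, e.g. README.md
        parts[int(m["index"])] = name
        totals.add(int(m["total"]))

    if not parts:
        raise ValueError("no split GGUF parts found")
    if len(totals) != 1:
        raise ValueError("parts disagree on the total count")

    total = totals.pop()
    missing = [i for i in range(1, total + 1) if i not in parts]
    if missing:
        raise ValueError(f"missing parts: {missing}")
    return parts[1]


if __name__ == "__main__":
    # Example filenames are made up for illustration.
    downloaded = [
        "model-00002-of-00003.gguf",
        "model-00001-of-00003.gguf",
        "model-00003-of-00003.gguf",
        "README.md",
    ]
    print(first_shard(downloaded))
```

The returned filename is the one to hand to llama.cpp as the model path; the check only confirms that no part is absent from the listing, not that each download completed intact.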