qwp4w3hyb committed
Commit aa1e9c8
Parent(s): 7ee0387

Update README.md

Files changed (1):
  1. README.md +5 -0
README.md CHANGED
@@ -19,6 +19,11 @@ license_link: LICENSE

  # Quant Infos

+ - quants done with an importance matrix for improved quantization loss
+ - K & IQ quants in basically all variants
+ - fixed end token for instruct mode (<|eot_id|>[128009])
+ - files larger than 50GB were split using the gguf-split utility, just download all parts and point llama.cpp to the first one (00001-of-x)
+
  Quantized with [llama.cpp](https://github.com/ggerganov/llama.cpp) commit with tokenizer fixes from [this](https://github.com/ggerganov/llama.cpp/pull/6745) branch cherry-picked [0d56246f4b9764158525d894b96606f6163c53a8](https://github.com/ggerganov/llama.cpp/commit/0d56246f4b9764158525d894b96606f6163c53a8) (master from 2024-04-18)

  Imatrix dataset was used from [here](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384)
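Files split by gguf-split carry a `NNNNN-of-NNNNN` suffix, and llama.cpp only needs to be pointed at the first shard. As a small sketch (the filenames below are hypothetical, not from this repo), you can verify every shard is present before loading:

```python
import re

def missing_shards(filenames):
    """Given the shard files you have, return the shard indices still missing,
    based on the gguf-split 'NNNNN-of-NNNNN' filename suffix."""
    pat = re.compile(r"-(\d{5})-of-(\d{5})\.gguf$")
    total = None
    have = set()
    for name in filenames:
        m = pat.search(name)
        if not m:
            continue  # not a split-model shard; ignore
        have.add(int(m.group(1)))
        total = int(m.group(2))
    if total is None:
        return []  # no shards recognized at all
    return sorted(set(range(1, total + 1)) - have)

# Hypothetical shard names for a model split into 3 parts:
files = ["model-Q8_0-00001-of-00003.gguf", "model-Q8_0-00003-of-00003.gguf"]
print(missing_shards(files))  # → [2], the second shard is missing
```

If the returned list is empty, all parts are downloaded and passing the `00001-of-…` file to llama.cpp should load the full model.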