InferenceIllusionist committed (verified)
Commit d14111e · 1 Parent(s): 3107ac6

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -29,7 +29,7 @@ PROUDLY PRESENTS
 <b>Quantization Note: Use repetition penalty (--repeat-penalty on llama.cpp) of 1.05 - 1.15 for best results </b>
 
 Quantized from fp16 with love.
-* Weighted quantizations were creating using fp16 GGUF and [groups_merged-enhancedV2-TurboMini.txt](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-9432658) in 189 chunks and n_ctx=512
+* Weighted quantizations were created using fp16 GGUF and [groups_merged-enhancedV2-TurboMini.txt](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-9432658) in 189 chunks and n_ctx=512
 * This method of calculating the importance matrix showed improvements in some areas for Mistral 7b and Llama3 8b models, see above post for details
 * The enhancedv2-turbomini file appends snippets from turboderp's calibration data to the standard groups_merged.txt file
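For context, the workflow the README describes (importance-matrix quantization with a 1.05–1.15 repetition penalty at inference time) can be sketched with llama.cpp's command-line tools. This is an illustrative sketch, not the exact commands used for this release: the model and output file names below are placeholders, and tool names/flags are from recent llama.cpp builds.

```shell
# Placeholder fp16 model path -- substitute your own GGUF file.
MODEL=model-fp16.gguf

# Compute the importance matrix from the calibration file
# (189 chunks at a context length of 512, as stated in the README):
./llama-imatrix -m "$MODEL" \
    -f groups_merged-enhancedV2-TurboMini.txt \
    --chunks 189 -c 512 -o imatrix.dat

# Produce a weighted quantization using that importance matrix
# (IQ4_XS is an example target type):
./llama-quantize --imatrix imatrix.dat "$MODEL" model-IQ4_XS.gguf IQ4_XS

# Run the quantized model with a repetition penalty in the
# recommended 1.05 - 1.15 range:
./llama-cli -m model-IQ4_XS.gguf --repeat-penalty 1.1 -p "Hello"
```

The repetition penalty is a sampling-time setting, so it is applied when running the quantized model, not during quantization itself.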