InferenceIllusionist committed on
Commit
1b17f50
1 Parent(s): b03731a

Update README.md

Files changed (1): README.md (+1 −1)
README.md CHANGED
@@ -11,7 +11,7 @@ tags:
 
 
 <b>Special request.</b> Quantized from fp32 with love. If you can't fit IQ quants in your VRAM, try using the K quants in this repo instead.
-* The [.imatrix](https://huggingface.co/InferenceIllusionist/Llama-3-70B-Instruct-Storywriter-iMat-GGUF/resolve/main/Llama-3-70B-Instruct-Storywriter.imatrix?download=true) file in this repo was created using this [process](https://huggingface.co/jukofyork/WizardLM-2-8x22B-imatrix)
+* The [.imatrix](https://huggingface.co/InferenceIllusionist/Llama-3-70B-Instruct-Storywriter-iMat-GGUF/resolve/main/Llama-3-70B-Instruct-Storywriter.imatrix?download=true) file in this repo was created using the Q8_0 quantization of Llama-3-70B-Instruct-Storywriter-iMat-GGUF.
 * Calculated in 88 chunks with n_ctx=512 using groups_merged.txt
 
 For a brief rundown of iMatrix quant performance please see this [PR](https://github.com/ggerganov/llama.cpp/pull/5747)
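The calibration described in the README (Q8_0 source, 88 chunks at n_ctx=512 over groups_merged.txt) can be sketched with llama.cpp's `imatrix` tool. This is a hedged sketch, not the author's exact invocation: the model path and output filename below are illustrative assumptions, and the binary may be named `imatrix` or `llama-imatrix` depending on the llama.cpp version.

```shell
# Sketch: computing an importance matrix with llama.cpp's imatrix tool.
# -c 512 and groups_merged.txt match the README; the model path and
# output filename are illustrative assumptions.
./llama-imatrix \
  -m Llama-3-70B-Instruct-Storywriter-Q8_0.gguf \
  -f groups_merged.txt \
  -c 512 \
  -o Llama-3-70B-Instruct-Storywriter.imatrix
```

The resulting `.imatrix` file is then passed to `llama-quantize` (via `--imatrix`) when producing the IQ and K quants.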