Lewdiculous committed
Commit 1b5d05a
1 Parent(s): bbe10e2

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -24,9 +24,9 @@ My GGUF-IQ-Imatrix quants for [**Nitral-AI/Poppy_Porpoise-0.85-L3-8B**](https://
  > [!NOTE]
  > **General usage:** <br>
  > Use the latest version of **KoboldCpp**. <br>
+ > Remember that you can also use `--flashattention` on KoboldCpp now even with non-RTX cards for reduced VRAM usage. <br>
  > For **8GB VRAM** GPUs, I recommend the **Q4_K_M-imat** quant for up to 12288 context sizes. <br>
  > For **12GB VRAM** GPUs, the **Q5_K_M-imat** quant will give you a great size/quality balance. <br>
- > Remember that you can also use `--flashattention` on KoboldCpp now even with non-RTX cards for reduced VRAM usage.
  >
  > **Resources:** <br>
  > You can find out more about how each quant stacks up against each other and their types [**here**](gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9) and [**here**](https://rentry.org/llama-cpp-quants-or-fine-ill-do-it-myself-then-pt-2), respectively.
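
For context, a launch command along these lines would apply the recommendations in the note above. This is a minimal sketch, not the repo's documented command: the `.gguf` filename is a placeholder, `--flashattention` is the flag the README mentions, and the other flags are standard KoboldCpp options you should confirm against `python koboldcpp.py --help` for your build.

```bash
# Minimal sketch: an 8GB-VRAM setup using the Q4_K_M-imat quant at 12288 context.
# The model filename below is hypothetical; point it at the quant file you downloaded.
python koboldcpp.py \
  --model ./Poppy_Porpoise-0.85-L3-8B-Q4_K_M-imat.gguf \
  --contextsize 12288 \
  --flashattention
```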