Lewdiculous committed
Commit 1b5d05a
1 Parent(s): bbe10e2

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -24,9 +24,9 @@ My GGUF-IQ-Imatrix quants for [**Nitral-AI/Poppy_Porpoise-0.85-L3-8B**](https://
  > [!NOTE]
  > **General usage:** <br>
  > Use the latest version of **KoboldCpp**. <br>
+ > Remember that you can also use `--flashattention` on KoboldCpp now even with non-RTX cards for reduced VRAM usage. <br>
  > For **8GB VRAM** GPUs, I recommend the **Q4_K_M-imat** quant for up to 12288 context sizes. <br>
  > For **12GB VRAM** GPUs, the **Q5_K_M-imat** quant will give you a great size/quality balance. <br>
- > Remember that you can also use `--flashattention` on KoboldCpp now even with non-RTX cards for reduced VRAM usage.
  >
  > **Resources:** <br>
  > You can find out more about how each quant stacks up against each other and their types [**here**](gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9) and [**here**](https://rentry.org/llama-cpp-quants-or-fine-ill-do-it-myself-then-pt-2), respectively.
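
For context, a launch command along these lines would apply the recommendations in the note above. This is a minimal sketch, not the repo's documented command: the `.gguf` filename is a placeholder, `--flashattention` is the flag the README mentions, and the other flags are standard KoboldCpp options you should confirm against `python koboldcpp.py --help` for your build.

```bash
# Minimal sketch: an 8GB-VRAM setup using the Q4_K_M-imat quant at 12288 context.
# The model filename below is hypothetical; point it at the quant file you downloaded.
python koboldcpp.py \
  --model ./Poppy_Porpoise-0.85-L3-8B-Q4_K_M-imat.gguf \
  --contextsize 12288 \
  --flashattention
```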