\*\* Evaluated with text-generation-webui ExLlama v0.0.2 on wikitext-2-raw-v1 (stride 512, max_length 0). For reference, [TheBloke_WizardLM-70B-V1.0-GPTQ_gptq-4bit-32g-actorder_True](https://huggingface.co/TheBloke/WizardLM-70B-V1.0-GPTQ/tree/gptq-4bit-32g-actorder_True) has a perplexity of 4.1015625.
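For readers unfamiliar with strided evaluation, here is a minimal sketch of how sliding-window (stride 512) perplexity scoring works. The `nll_fn` callback is a hypothetical stand-in for a real model call (e.g. ExLlama's logits pass); it is not part of any library API.

```python
import math

def strided_perplexity(nll_fn, tokens, window=2048, stride=512):
    """Sliding-window perplexity: each pass advances by `stride` tokens and
    scores only the tokens not already counted, using the preceding tokens
    in the window as context. `nll_fn(context, targets)` must return the
    total negative log-likelihood of `targets` given `context`."""
    total_nll, counted, prev_end = 0.0, 0, 0
    for begin in range(0, len(tokens), stride):
        end = min(begin + window, len(tokens))
        trg_len = end - prev_end          # tokens newly scored this pass
        context = tokens[begin:end - trg_len]
        targets = tokens[end - trg_len:end]
        total_nll += nll_fn(context, targets)
        counted += trg_len
        prev_end = end
        if end == len(tokens):
            break
    return math.exp(total_nll / counted)
```

With a uniform model assigning probability 1/2 to every token, the perplexity comes out to exactly 2.0, a quick sanity check for the windowing arithmetic.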
\*\*\* Without Flash Attention. For better VRAM usage, install Flash Attention: https://github.com/Dao-AILab/flash-attention#installation-and-features
## Description: