Update README.md
README.md CHANGED
@@ -8,6 +8,8 @@ tags:
 quantized_by: bartowski
 ---
 
+# <b>Heads up:</b> currently CUDA offloading is broken unless you enable flash attention
+
 ## Llamacpp imatrix Quantizations of Qwen2-7B-Instruct
 
 Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> commit <a href="https://github.com/ggerganov/llama.cpp/commit/ee459f40f65810a810151b24eba5b8bd174ceffe">ee459f40f65810a810151b24eba5b8bd174ceffe</a> for quantization.
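For readers hit by the CUDA-offload issue the heads-up describes, llama.cpp exposes a flash-attention switch (`-fa` / `--flash-attn`) alongside the GPU layer-offload flag (`-ngl`). A minimal sketch of an invocation with it enabled — the binary name, model filename, and layer count here are placeholders, not part of this commit:

```shell
# Sketch, assuming a local llama.cpp build and a downloaded GGUF file.
# -ngl 99 requests offloading all layers to the GPU;
# -fa enables flash attention, which the note above says is required
#     for CUDA offloading to work with this model at this commit.
./llama-cli -m Qwen2-7B-Instruct-Q4_K_M.gguf -ngl 99 -fa -p "Hello"
```

Older builds near the quantization commit shipped the same flag under the `./main` binary name instead of `./llama-cli`.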