Update README.md
README.md CHANGED
@@ -8,6 +8,8 @@ tags:
 quantized_by: bartowski
 ---
 
+# <b>Heads up:</b> currently CUDA offloading is broken unless you enable flash attention
+
 ## Llamacpp imatrix Quantizations of Qwen2-7B-Instruct
 
 Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> commit <a href="https://github.com/ggerganov/llama.cpp/commit/ee459f40f65810a810151b24eba5b8bd174ceffe">ee459f40f65810a810151b24eba5b8bd174ceffe</a> for quantization.
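For readers hit by the CUDA-offload issue the heads-up describes, llama.cpp exposes a flash-attention switch (`-fa` / `--flash-attn`) alongside the GPU layer-offload flag (`-ngl`). A minimal sketch of an invocation with it enabled — the binary name, model filename, and layer count here are placeholders, not part of this commit:

```shell
# Sketch, assuming a local llama.cpp build and a downloaded GGUF file.
# -ngl 99 requests offloading all layers to the GPU;
# -fa enables flash attention, which the note above says is required
#     for CUDA offloading to work with this model at this commit.
./llama-cli -m Qwen2-7B-Instruct-Q4_K_M.gguf -ngl 99 -fa -p "Hello"
```

Older builds near the quantization commit shipped the same flag under the `./main` binary name instead of `./llama-cli`.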