bartowski committed
Commit 13c6e83
1 Parent(s): 4e83ff2

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -10,9 +10,11 @@ tags:
 quantized_by: bartowski
 ---
 
+# <b>Heads up:</b> currently CUDA offloading is broken unless you enable flash attention
+
 ## Llamacpp imatrix Quantizations of Qwen2-72B-Instruct
 
-Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/master">master</a> for quantization.
+Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> commit <a href="https://github.com/ggerganov/llama.cpp/commit/ee459f40f65810a810151b24eba5b8bd174ceffe">ee459f40f65810a810151b24eba5b8bd174ceffe</a> for quantization.
 
 Original model: https://huggingface.co/Qwen/Qwen2-72B-Instruct
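The added heads-up says CUDA offloading only works with flash attention enabled. A minimal sketch of what that looks like on the llama.cpp command line, assuming a CUDA build of `llama-cli` and a local GGUF file (the filename below is illustrative, not a file this commit provides):

```shell
# Sketch of the workaround from the heads-up note above.
# Assumptions: llama.cpp built with CUDA support, and a quantized GGUF
# of this model downloaded locally (filename is illustrative).
#   -ngl / --n-gpu-layers : number of layers to offload to the GPU
#   -fa  / --flash-attn   : enable flash attention (off by default)
./llama-cli \
  -m Qwen2-72B-Instruct-Q4_K_M.gguf \
  -ngl 99 \
  -fa \
  -p "Hello"
```

Without `-fa`, per the note, GPU-offloaded runs of this model are currently broken; the flag costs nothing to add and generally reduces KV-cache memory use as well.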