I have now tried two quantizations, 8_0 and 6_K; both fail as shown below.
#2 opened by BigDeeper
~/ollama/ollama run phi-3-mini-128k-instruct.Q6_K
Error: llama runner process no longer running: -1
microsoft/Phi-3-mini-4k-instruct-gguf does not cause the same error.
See the relevant GitHub issue here:
https://github.com/ggerganov/llama.cpp/issues/6849
Quants have been updated with the latest release of llama.cpp.
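Since the GGUF files were rebuilt, a model previously created in Ollama from the old quant has to be re-created from the fresh file. A minimal sketch of that workflow is below; the GGUF filename and model name are assumptions based on the command in the report, not taken from the repo.

```shell
# Point a Modelfile at the freshly downloaded GGUF (filename assumed).
cat > Modelfile <<'EOF'
FROM ./phi-3-mini-128k-instruct.Q6_K.gguf
EOF

# The following need a running Ollama install, so they are left commented:
# ollama rm phi-3-mini-128k-instruct.Q6_K      # remove the model built from the old quant
# ollama create phi-3-mini-128k-instruct.Q6_K -f Modelfile
# ollama run phi-3-mini-128k-instruct.Q6_K
```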
munish0838 changed discussion status to closed.