gguf — #17, opened by daisr
GGUF, pls
Ollama, pls :)
It's out there if you search.
OK, so it's not available as GGUF yet because it can't be converted that easily?
llama.cpp doesn't support this model yet
Try the exl2 quants; they work. I'm using Turboderp's 8.0bpw version and can run it in text-generation-webui with 128k context (using the 8-bit cache) within 24 GB of GPU memory. It's a good model.
It should be fine now!
It's in Unsloth and in llama.cpp (they had to update the embeddings).
The llama.cpp PR got merged.
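For anyone wanting to make their own GGUF now that support is merged, a rough sketch of the usual llama.cpp conversion flow (the model path, output filenames, and quant type below are just placeholders for whatever checkpoint you downloaded):

```shell
# Grab a llama.cpp checkout recent enough to include the merged PR
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt

# Convert the HF checkpoint to a full-precision GGUF
# (/path/to/model is a placeholder for your local download)
python convert_hf_to_gguf.py /path/to/model \
    --outfile model-f16.gguf --outtype f16

# Build the quantize tool and shrink it, e.g. to Q4_K_M
cmake -B build && cmake --build build --target llama-quantize
./build/bin/llama-quantize model-f16.gguf model-q4_k_m.gguf Q4_K_M
```

The resulting `.gguf` should then load in llama.cpp, Ollama, etc., assuming your checkout has the updated embedding support.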