How soon until GGML?

#2
opened by rombodawg

I see the link to the GGML model, but it's a broken link. Does this mean the GGML version is in the works?

That was a mistake. I forgot to disable GGML when configuring this quant, then deleted the repo but forgot to update this one. I've removed the link for now.

I may try 70B GGML shortly. The llama.cpp PR seems to be working on CPU only. So I could put out some files marked experimental.

To be clear, they will only work with llama.cpp on the command line, not with any of the normal GGML UIs (text-generation-webui, LM Studio, KoboldCpp, etc.), and you will have to compile llama.cpp from source yourself. They will also be slow as hell due to being CPU-only, and you will probably need 64GB of RAM.
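For anyone who wants to try it anyway, building from the PR branch would look roughly like this. Treat it as a sketch rather than tested instructions: the local branch name and model filename are placeholders, and the flags may change before the PR is merged.

```bash
# Fetch the 70B support PR (#2276, linked further down) into a local branch.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
git fetch origin pull/2276/head:llama-70b-pr   # local branch name is arbitrary
git checkout llama-70b-pr
make                                           # CPU-only build

# Run inference from the command line. The model filename is a placeholder;
# -gqa 8 is what the PR uses for 70B's grouped-query attention.
./main -m llama-2-70b.ggmlv3.q4_0.bin -gqa 8 -t 16 -n 128 -p "Hello, world"
```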

Fair enough

I am using the console for GGML models anyway ;)
CPU-only is also no concern for me, as I have a 7950X3D and get almost 2 t/s on CPU with 65B models.

I tried the llama.cpp CPU-only support last night and I couldn't get it working. Watch this for updates: https://github.com/ggerganov/llama.cpp/pull/2276#issuecomment-1646328190
