TheBloke/CausalLM-14B-GGUF · No Q_K quants?

TheYuriLover

Oct 22, 2023

Title

KerfuffleV2

Oct 23, 2023

edited Oct 23, 2023

To use k-quants, the tensor dimensions have to be a multiple of the k-quants block size. LLaMA models usually meet that requirement, but the 14B here doesn't. (Technically there's a way to use k-quants anyway, but it requires compiling with a special flag to both quantize and load the models, and you lose some of the advantage of k-quants that way.)
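For anyone who wants to see what that constraint amounts to, here is a rough sketch of the divisibility check in Python. It is not llama.cpp's actual code. It assumes a block size of 256, that the row length is the last entry of the shape, and the tensor names and shapes are made up for illustration rather than read from this model.

```python
# Illustrative sketch only, not llama.cpp's implementation.
# Assumption: the k-quants block size is 256 and the constraint applies
# to the row length, taken here as the last entry of the shape.
QK_K = 256


def incompatible_tensors(shapes, block_size=QK_K):
    """Return the names of tensors whose row length is not a multiple of block_size."""
    return [
        name
        for name, shape in shapes.items()
        if shape[-1] % block_size != 0
    ]


# Hypothetical tensor names and shapes, chosen only to show both outcomes.
example_shapes = {
    "blk.0.attn_q.weight": (4096, 4096),
    "blk.0.ffn_down.weight": (4096, 11008),
    "hypothetical.odd.weight": (5120, 13696),
}

if __name__ == "__main__":
    for name in incompatible_tensors(example_shapes):
        print(f"{name}: row length is not a multiple of {QK_K}")
```

Passing a smaller `block_size` (64 is used in the test purely as an example value) shows why a build with a smaller block size could accept tensors that the default rejects.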
There is also a separate issue with the conversion: the BPE merges didn't get added to the GGUF files (both 7B and 14B, as far as I know), so you can't load the models. This is not TB's fault, but I suggest waiting for a fixed version before downloading them. Associated GitHub issue: https://github.com/ggerganov/llama.cpp/issues/3732 Edit: Should be fixed now.

HDiffusion

Oct 23, 2023

There's a PR to change this behavior: https://github.com/ggerganov/llama.cpp/pull/3747
