llama.cpp cannot load Q6_K model
Loading stops at the metadata dump
...
llm_load_print_meta: expert_weights_norm = 1
llm_load_print_meta: expert_gating_func  = sigmoid
llm_load_print_meta: rope_yarn_log_mul   = 0.1000
using the latest llama.cpp on 'main'. No error is reported.
Does the same error occur for the other quants?
Sometimes you need to wait quite a while for the model to load into system RAM. I would use top
or Task Manager to check whether the process is actually loading it.
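On Linux, a quick way to check this from a terminal is to watch the process's resident set size and cumulative disk reads. A minimal sketch, assuming a single `llama-cli` process on a Linux host:

```shell
# Find the llama-cli process (assumes exactly one instance is running).
PID=$(pgrep -f llama-cli | head -n1)
# Resident set size in kB: if this grows between runs, the model is loading.
ps -o rss= -p "$PID"
# Cumulative bytes read from disk by the process (Linux-only /proc interface).
grep read_bytes "/proc/$PID/io"
```

Re-running these two commands every few seconds shows whether anything is moving even when htop looks idle.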
...I haven't tried. The model and quants are rather large, so I was hoping for "it is a known error; you need to pass this at startup", or something along those lines...
I will try a smaller quant and will let you know how I go.
EDIT: yes, of course. htop shows no activity: no IO, no memory filling up, and no CPU load. Nothing at all is happening related to loading the model.
I just tried it with:
./llama-cli --model /models/DeepSeek-V3-GGUF/DeepSeek-V3-Q2_K_L/DeepSeek-V3-Q2_K_L-00001-of-00005.gguf --cache-type-k q5_0 --prompt '<|User|>What is 1+1?<|Assistant|>'
...and it also does not load, or even begin loading. This workstation has 256 GB RAM and 44 cores, so it should be able to run this model without any significant effort.
Just to confirm that nothing else is broken, I loaded other models (not DeepSeek or Unsloth) without issues.
EDIT: belay that... something is happening. llama.cpp is hammering a single core at a time, and RAM is slowly filling up! Edit again: it works:
What is 1+1?Solution:
To find the value of (1 + 1), we can perform the addition step by step.
[
1 + 1 = 2
]
Final Answer:
[
\boxed{2}
] [end of text]
The problem seems to be limited to Q6_K on my system. I don't have the resources/time to keep testing it.
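For anyone hitting the same silent startup: my guess (an assumption, not confirmed) is that the default mmap loading faults pages in lazily on a single thread, which is why htop initially showed nothing. A sketch of an invocation that makes loading progress visible up front, using the existing `--no-mmap` and `-t` flags of llama-cli (untested with the Q6_K quant):

```shell
./llama-cli \
  --model /models/DeepSeek-V3-GGUF/DeepSeek-V3-Q2_K_L/DeepSeek-V3-Q2_K_L-00001-of-00005.gguf \
  --no-mmap \
  -t 44 \
  --cache-type-k q5_0 \
  --prompt '<|User|>What is 1+1?<|Assistant|>'
```

`--no-mmap` reads the whole file into RAM eagerly, so memory usage in top climbs steadily from the start instead of only as pages are touched.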