Can't run inference

#1
by Tibbnak - opened

When trying to run inference on the Q4_K_M GGUF with llama.cpp (latest compiled server.exe):

llama_model_load: error loading model: create_tensor: tensor 'blk.0.attn_q.weight' has wrong shape; expected 3072, 3072, got 3072, 4096, 1, 1
llama_load_model_from_file: failed to load model
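One plausible cause (an assumption on my part, not confirmed in this thread): llama.cpp computes the expected `attn_q` shape from the hidden size and head count, so a model config with an explicit per-head dimension that differs from `n_embd // n_head` would store a differently-shaped tensor. A minimal sketch of that arithmetic, using hypothetical config values chosen only to reproduce the numbers in the error:

```python
# Hypothetical config values -- an illustration, not the actual model's config.
n_embd = 3072    # hidden size
n_head = 32      # number of attention heads
head_dim = 128   # explicit per-head dim (note: not equal to n_embd // n_head)

# Shape a loader expects if it assumes head_dim == n_embd // n_head:
expected = (n_embd, n_embd)             # (3072, 3072)

# Shape actually stored when head_dim is set explicitly in the config:
stored = (n_embd, n_head * head_dim)    # (3072, 4096)

print("expected:", expected, "stored:", stored)
```

If this is the cause, the fix is usually on the conversion or llama.cpp side (supporting the explicit `head_dim`), not something the user can work around at load time.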

Hmm, I'm getting the same issue, let me investigate.
