Error loading model

#1
by dduval - opened

When trying to load nous-hermes-llama-2-7b.ggmlv3.q5_K_M.bin with koboldcpp, or llama.cpp server.exe, I get the following error:

error loading model: llama.cpp: tensor 'tok_embeddings.weight' has wrong shape; expected 4096 x 32032, got 4096 x 32000
llama_load_model_from_file: failed to load model

I guess it's because n_vocab is set to 32032, which is unusual.

Ah shit. Yeah, it is actually 32000, not 32032. It was incorrectly labelled as 32032 at first in the upstream repo. The files converted without errors, so I thought they were fine, but the conversion must have written bad metadata.
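For anyone who wants to check what vocab size a file's header claims before trying to load it, here is a minimal sketch. It assumes the ggmlv3-era GGJT layout as I remember it from llama.cpp at the time: a little-endian uint32 magic (`ggjt`), a uint32 version, then seven uint32 hyperparameters in the order n_vocab, n_embd, n_mult, n_head, n_layer, n_rot, ftype. The function name `read_ggjt_hparams` is made up for illustration and is not part of llama.cpp or koboldcpp.

```python
import struct

# "ggjt" read as a little-endian uint32 (assumed magic for ggmlv3-era files)
GGJT_MAGIC = 0x67676A74

# Assumed field order of the hyperparameter block that follows magic + version
HPARAM_NAMES = ("n_vocab", "n_embd", "n_mult", "n_head", "n_layer", "n_rot", "ftype")


def read_ggjt_hparams(path):
    """Read only the file header and return its hyperparameters as a dict."""
    header_size = 8 + 4 * len(HPARAM_NAMES)
    with open(path, "rb") as f:
        header = f.read(header_size)
    if len(header) < header_size:
        raise ValueError("file too short to contain a GGJT header")

    magic, version = struct.unpack_from("<II", header, 0)
    if magic != GGJT_MAGIC:
        raise ValueError(f"unexpected magic 0x{magic:08x}, not a GGJT file")

    values = struct.unpack_from(f"<{len(HPARAM_NAMES)}I", header, 8)
    hparams = dict(zip(HPARAM_NAMES, values))
    hparams["version"] = version
    return hparams


if __name__ == "__main__":
    import sys

    # Usage: python check_header.py path/to/model.bin
    print(read_ggjt_hparams(sys.argv[1]))
```

If the layout assumption holds, a file hit by this problem should report n_vocab as 32032 in the header even though the embedding tensor only has 32000 rows, which is the mismatch the loader complains about.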

I'll have to do them again, I guess. I'll start that now.

The fixed ones are uploading now. Refresh the Files and versions tab and wait until the file you want shows a new upload, then you can grab it. I tested them and they work fine now.

Yeah, I've noticed they're updating. That was so fast I did not have time to make coffee, so I grabbed a beer instead. Cheers and many thanks! You're the best.
