error: failed to load model

#2
by Elfrino - opened

I'm trying to run the model with KoboldCPP, but I get this error:

```
Using automatic RoPE scaling. If the model has customized RoPE settings, they will be used directly instead!
System Info: AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 |
llama_model_loader: loaded meta data with 27 key-value pairs and 322 tensors from C:\Users\Rich\Desktop\AI\Text\c4ai-command-r-v01-Q5_K_M.gguf (version GGUF V3 (latest))
llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'command-r'
llama_load_model_from_file: failed to load model
Traceback (most recent call last):
File "koboldcpp.py", line 3330, in
File "koboldcpp.py", line 3073, in main
File "koboldcpp.py", line 396, in load_model
OSError: exception: access violation reading 0x0000000000000070
[17976] Failed to execute script 'koboldcpp' due to unhandled exception!
```
I'm not sure if it's the model or if I'm doing something wrong.

Just tried it in LM Studio and I get a similar error:


"llama.cpp error: 'error loading model vocabulary: unknown pre-tokenizer type: 'command-r''"

Koboldcpp is missing this commit from upstream in the main branch:

Add the missing code blocks to llama.cpp and llama.h and recompile it, or just wait for the next release.

Thanks. I'll wait a bit. :)

Jobaar changed discussion status to closed
