Can't load model in LlamaCpp

#4
by ThoilGoyang

I'm using LlamaCpp to load the model in Jupyter, but I'm getting this error:
llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'command-r'
llama_load_model_from_file: failed to load model
ValidationError: 1 validation error for LlamaCpp root
Could not load Llama model from path: ../../LLM/aya-23-8B-IQ4_NL.gguf. Received error Failed to load model from file: ../../LLM/aya-23-8B-IQ4_NL.gguf (type=value_error)
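
For what it's worth, the 'command-r' string comes straight from the file's metadata. Here's a minimal sketch to confirm what the GGUF declares, assuming the gguf package (pip install gguf); the field-access details can vary between gguf versions:

from gguf import GGUFReader

reader = GGUFReader("../../LLM/aya-23-8B-IQ4_NL.gguf")
field = reader.fields["tokenizer.ggml.pre"]
# for a string field, data[0] indexes the part holding the value bytes
print(field.parts[field.data[0]].tobytes().decode("utf-8"))  # expect: command-r

A llama.cpp build that predates this pre-tokenizer type refuses to load the vocabulary with exactly this error.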

Here's my code:

from langchain_community.llms import LlamaCpp
from langchain_core.callbacks import CallbackManager, StreamingStdOutCallbackHandler

n_gpu_layers = -1  # -1 offloads all layers to the GPU
n_batch = 4096
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

llm = LlamaCpp(
    model_path="../../LLM/aya-23-8B-IQ4_NL.gguf",
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    temperature=0,
    callback_manager=callback_manager,
    verbose=True,  # verbose is required to pass to the callback manager
    max_tokens=4096,
    n_ctx=4096,
)

I already updated the llama-cpp-python and langchain packages.
Any suggestions?

Can you confirm what hardware it's running on and/or try a K quant?

I'm running on a GCP Workbench n1 machine with a T4 GPU, and I already tried the Q8_0.
If I load other models such as Mistral, Llama 3, or Llama 2, everything works fine.

@bartowski I'm also having this issue using LM Studio (also based on llama.cpp). I get: "llama.cpp error: 'error loading model vocabulary: unknown pre-tokenizer type: 'command-r''".
I have a Mac M1 with 32 GB RAM.

I updated LM Studio to version 0.2.24 and the issue was resolved! ☺️

I can use it in LM Studio, but I can't use it with LlamaCpp in Python.

Yeah, so that implies to me that llama-cpp-python is somehow still on an older, unsupported version... but the most recent release is from 5 days ago, well after support was added.
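
A quick way to check what the notebook is actually importing (a minimal sketch; llama-cpp-python exposes its version as __version__):

import llama_cpp
print(llama_cpp.__version__)

If the Jupyter kernel runs in a different environment than the one you upgraded with pip, the notebook may still be loading the old build.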

So it's fixed by reinstalling the library with GPU support.
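
For anyone else landing here, the reinstall typically looks something like this (a sketch assuming a CUDA GPU; the exact CMake flag depends on the llama-cpp-python release, older ones used -DLLAMA_CUBLAS=on):

CMAKE_ARGS="-DLLAMA_CUDA=on" pip install --upgrade --force-reinstall --no-cache-dir llama-cpp-python

--force-reinstall and --no-cache-dir make pip rebuild the wheel instead of reusing a cached CPU-only build.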
