Can't load GGUF model with llama_cpp_python

#6 by Tianyi000

I used llama_cpp_python to load this model, but it failed.
Model name: ggml-dbrx-instruct-16x12b-iq3
I downloaded ggml-dbrx-instruct-16x12b-iq3_xs-00001-of-00002.gguf and ggml-dbrx-instruct-16x12b-iq3_xs-00002-of-00002.gguf, then used cat to combine them into one file, but loading that file with llama_cpp_python failed.
The error message suggests the file is invalid. Looking forward to hearing from you!

"""

python

from llama_cpp import Llama
llm = Llama(
model_path="/home/server/python/models/models--dranger003--dbrx-instruct-iMat.GGUF/snapshots/f85aad8d997fed7828cc055e418d14d9ddefcc33/ggml-dbrx-instruct-16x12b-iq3.gguf",
n_gpu_layers=-1, # Uncomment to use GPU acceleration
# seed=1337, # Uncomment to set a specific seed
# n_ctx=2048, # Uncomment to increase the context window
# embedding=True
)
"""
Output:

```
llama_model_load: error loading model: invalid split file: /home/server/python/models/models--dranger003--dbrx-instruct-iMat.GGUF/snapshots/f85aad8d997fed7828cc055e418d14d9ddefcc33/ggml-dbrx-instruct-16x12b-iq3.gguf
llama_load_model_from_file: failed to load model

ValueError                                Traceback (most recent call last)
/home/server/python/main.ipynb Cell 2, line 2
      1 from llama_cpp import Llama
----> 2 llm = Llama(
      3     model_path="/home/server/python/models/models--dranger003--dbrx-instruct-iMat.GGUF/snapshots/f85aad8d997fed7828cc055e418d14d9ddefcc33/ggml-dbrx-instruct-16x12b-iq3.gguf",
      4     n_gpu_layers=-1,
      5     # seed=1337,
      6     # n_ctx=2048,
      7     # embedding=True
      8 )

File ~/anaconda3/lib/python3.11/site-packages/llama_cpp/llama.py:322, in Llama.__init__(self, model_path, n_gpu_layers, split_mode, main_gpu, tensor_split, vocab_only, use_mmap, use_mlock, kv_overrides, seed, n_ctx, n_batch, n_threads, n_threads_batch, rope_scaling_type, pooling_type, rope_freq_base, rope_freq_scale, yarn_ext_factor, yarn_attn_factor, yarn_beta_fast, yarn_beta_slow, yarn_orig_ctx, logits_all, embedding, offload_kqv, last_n_tokens_size, lora_base, lora_scale, lora_path, numa, chat_format, chat_handler, draft_model, tokenizer, type_k, type_v, verbose, **kwargs)
    319 if not os.path.exists(model_path):
    320     raise ValueError(f"Model path does not exist: {model_path}")
--> 322 self._model = _LlamaModel(
    323     path_model=self.model_path, params=self.model_params, verbose=self.verbose
    324 )
    326 # Override tokenizer
    327 self.tokenizer_ = tokenizer or LlamaTokenizer(self)

File ~/anaconda3/lib/python3.11/site-packages/llama_cpp/_internals.py:55, in _LlamaModel.__init__(self, path_model, params, verbose)
     50 self.model = llama_cpp.llama_load_model_from_file(
     51     self.path_model.encode("utf-8"), self.params
     52 )
     54 if self.model is None:
---> 55     raise ValueError(f"Failed to load model from file: {path_model}")

ValueError: Failed to load model from file: /home/server/python/models/models--dranger003--dbrx-instruct-iMat.GGUF/snapshots/f85aad8d997fed7828cc055e418d14d9ddefcc33/ggml-dbrx-instruct-16x12b-iq3.gguf
```

@Tianyi000 Hey there, do not merge the splits using concatenation; that won't work, because each split is a standalone GGUF file with its own header and split metadata, so byte-concatenating them produces an invalid file. Take a look at the model page, there is a command for merging the splits properly, but this shouldn't be needed since you can just specify the first split's filename as the model file and the remaining splits will be picked up automatically.
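For example, loading directly from the first split should work. This is a minimal sketch, assuming the two split files were downloaded unmodified into the snapshot directory shown in your traceback:

```python
from llama_cpp import Llama

# Point model_path at the FIRST split (-00001-of-00002); llama.cpp
# reads the split metadata from its header and loads the remaining
# parts from the same directory automatically, so no merge is needed.
llm = Llama(
    model_path=(
        "/home/server/python/models/models--dranger003--dbrx-instruct-iMat.GGUF/"
        "snapshots/f85aad8d997fed7828cc055e418d14d9ddefcc33/"
        "ggml-dbrx-instruct-16x12b-iq3_xs-00001-of-00002.gguf"
    ),
    n_gpu_layers=-1,  # offload all layers to the GPU
)
```

If you really do want a single file, use llama.cpp's gguf-split tool to merge (the model page has the exact command), not cat.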

@dranger003 Thank you for your reply. I apologize for not reading the model card carefully. Thanks again for sharing.
