
doesn't work with llama.cpp

#1
by vasilee - opened
llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 340, got 268
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '../models/stablelm-2-zephyr-1_6b-Q4_K_M.gguf'
main: error: unable to load model
Second State org

The GGUF models were generated with https://github.com/ggerganov/llama.cpp/pull/5052. You can try again with that PR.

Download the latest release or build from source, and it will work.
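A build from source along those lines might look like the following sketch. It assumes the standard llama.cpp Makefile build of that period, where the CLI binary was named `main` (matching the error output above); the model path is taken from the thread.

```shell
# Clone and build llama.cpp from source so the binary includes
# PR #5052 (the StableLM 2 support referenced above).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Run the model from the thread; the path is the one shown in the
# original error message and may differ on your machine.
./main -m ../models/stablelm-2-zephyr-1_6b-Q4_K_M.gguf -p "Hello" -n 64
```

If the build predates that PR, loading will still fail with the same "wrong number of tensors" error, so verify you are on a recent commit before rebuilding.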
