Failed to load
llama_model_loader: - type f32: 37 tensors
llama_model_loader: - type q8_0: 127 tensors
error loading model: unknown model architecture: 'gemma'
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model 'gemma-2b-it-q8_0.gguf'
{"timestamp":1708650706,"level":"ERROR","function":"load_model","line":590,"message":"unable to load model","model":"gemma-2b-it-q8_0.gguf"}
Sorry, I don't think that file came from this repo, because I haven't uploaded the Q8_0 version yet! :)
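A note for anyone hitting the 'unknown model architecture: gemma' error above: it usually means the llama.cpp build predates Gemma support, so updating and rebuilding llama.cpp from a recent commit normally resolves it. A minimal sketch, assuming a local git checkout of llama.cpp (binary names and flags vary between versions):
cd llama.cpp
git pull
make clean && make
./main -m gemma-2b-it-q8_0.gguf -p "Hello"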
llm_load_tensors: using CUDA for GPU acceleration
error loading model: create_tensor: tensor 'output.weight' not found
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '/ai/llama/models/gemma/gemma-7b-it.Q5_K_M.gguf'
main: error: unable to load model
Just pushed a new version that solves your problem, @DKingg (and it has good overall perplexity).
Hello, can you share your conversion method here? I used llama.cpp to convert gemma-7b-it like this: python convert.py /home/jovyan/share-pvc-dutianwei/models/huggingface/google/gemma-7b-it. It generated a file named ggml-model-f16.gguf and everything looked fine, but when I load it with
- ./server
- '-m'
- >-
/home/jovyan/models/models/huggingface/google/gemma-7b-it/ggml-model-f16.gguf
- '-c'
- '4096'
- '--host'
- 0.0.0.0
- '--port'
- '8000'
- '-ngl'
- '100'
it shows an error like this:
llama_model_load: error loading model: create_tensor: tensor 'output.weight' not found
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '/home/jovyan/models/models/huggingface/google/gemma-7b-it/ggml-model-f16.gguf'
terminate called without an active exception
{"timestamp":1710482521,"level":"ERROR","function":"load_model","line":375,"message":"unable to load model","model":"/home/jovyan/models/models/huggingface/google/gemma-7b-it/ggml-model-f16.gguf"},how to solve it ?please give me some help, thanks~~~
@totoro191
Sure, the fix should be simple. You need to use convert-hf-to-gguf.py instead of convert.py to create the f16 model.
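For reference, a minimal sketch of that workflow, assuming a recent llama.cpp checkout (by default convert-hf-to-gguf.py writes ggml-model-f16.gguf into the model directory; flag names can differ between llama.cpp versions):
python convert-hf-to-gguf.py /home/jovyan/models/models/huggingface/google/gemma-7b-it --outtype f16
./server -m /home/jovyan/models/models/huggingface/google/gemma-7b-it/ggml-model-f16.gguf -c 4096 --host 0.0.0.0 --port 8000 -ngl 100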