Model won't load
#1 by nonetrix - opened
[noah@ai]~/Documents/AI/text-generation-webui% ./start_linux.sh
22:10:49-154599 INFO Starting Text generation web UI
22:10:49-157176 INFO Loading the extension "gallery"
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
22:10:59-772999 INFO Loading ggml-model-Q8_0.gguf
22:10:59-875170 INFO llama.cpp weights detected: models/ggml-model-Q8_0.gguf
ggml_init_cublas: GGML_CUDA_FORCE_MMQ: no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 ROCm devices:
Device 0: AMD Radeon RX 6800, compute capability 10.3
llama_model_loader: loaded meta data with 22 key-value pairs and 444 tensors from models/ggml-model-Q8_0.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = orion
llama_model_loader: - kv 1: general.file_type u32 = 7
llama_model_loader: - kv 2: general.name str = Orion-14B-Chat
llama_model_loader: - kv 3: orion.tensor_data_layout str = Meta AI original pth
llama_model_loader: - kv 4: orion.context_length u32 = 4096
llama_model_loader: - kv 5: orion.embedding_length u32 = 5120
llama_model_loader: - kv 6: orion.block_count u32 = 40
llama_model_loader: - kv 7: orion.feed_forward_length u32 = 15360
llama_model_loader: - kv 8: orion.attention.head_count u32 = 40
llama_model_loader: - kv 9: orion.attention.head_count_kv u32 = 40
llama_model_loader: - kv 10: orion.attention.layer_norm_epsilon f32 = 0.000010
llama_model_loader: - kv 11: tokenizer.ggml.model str = llama
llama_model_loader: - kv 12: tokenizer.ggml.tokens arr[str,84608] = ["<unk>", "<s>", "</s>", " ", "ββ...
llama_model_loader: - kv 13: tokenizer.ggml.scores arr[f32,84608] = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv 14: tokenizer.ggml.token_type arr[i32,84608] = [2, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 15: tokenizer.ggml.bos_token_id u32 = 1
llama_model_loader: - kv 16: tokenizer.ggml.eos_token_id u32 = 2
llama_model_loader: - kv 17: tokenizer.ggml.padding_token_id u32 = 0
llama_model_loader: - kv 18: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 19: tokenizer.ggml.add_eos_token bool = false
llama_model_loader: - kv 20: tokenizer.chat_template str = {% for message in messages %}{% if lo...
llama_model_loader: - kv 21: general.quantization_version u32 = 2
llama_model_loader: - type f32: 162 tensors
llama_model_loader: - type q8_0: 282 tensors
error loading model: unknown model architecture: 'orion'
llama_load_model_from_file: failed to load model
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
22:11:01-327344 ERROR Failed to load the model.
Traceback (most recent call last):
File "/home/noah/Documents/AI/text-generation-webui/modules/ui_model_menu.py", line 213, in load_model_wrapper
shared.model, shared.tokenizer = load_model(selected_model, loader)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/noah/Documents/AI/text-generation-webui/modules/models.py", line 87, in load_model
output = load_func_map[loader](model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/noah/Documents/AI/text-generation-webui/modules/models.py", line 250, in llamacpp_loader
model, tokenizer = LlamaCppModel.from_pretrained(model_file)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/noah/Documents/AI/text-generation-webui/modules/llamacpp_model.py", line 101, in from_pretrained
result.model = Llama(**params)
^^^^^^^^^^^^^^^
File "/home/noah/Documents/AI/text-generation-webui/installer_files/env/lib/python3.11/site-packages/llama_cpp_cuda/llama.py", line 962, in __init__
self._n_vocab = self.n_vocab()
^^^^^^^^^^^^^^
File "/home/noah/Documents/AI/text-generation-webui/installer_files/env/lib/python3.11/site-packages/llama_cpp_cuda/llama.py", line 2274, in n_vocab
return self._model.n_vocab()
^^^^^^^^^^^^^^^^^^^^^
File "/home/noah/Documents/AI/text-generation-webui/installer_files/env/lib/python3.11/site-packages/llama_cpp_cuda/llama.py", line 251, in n_vocab
assert self.model is not None
^^^^^^^^^^^^^^^^^^^^^^
AssertionError
Exception ignored in: <function LlamaCppModel.__del__ at 0x7f14f08079c0>
Traceback (most recent call last):
File "/home/noah/Documents/AI/text-generation-webui/modules/llamacpp_model.py", line 58, in __del__
del self.model
^^^^^^^^^^
AttributeError: 'LlamaCppModel' object has no attribute 'model'
It seems llama.cpp still doesn't support this model architecture ('orion'). You may want to open an issue at https://github.com/ggerganov/llama.cpp to request support.
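The key line in the log is `error loading model: unknown model architecture: 'orion'`: the bundled llama.cpp build simply doesn't recognize the `general.architecture` value stored in the GGUF metadata. You can check this value yourself before loading, without llama.cpp at all, since GGUF is a simple binary format (magic, version, tensor count, KV count, then key-value pairs, with `general.architecture` conventionally the first key). Below is a minimal sketch of such a check; `read_gguf_architecture` is a hypothetical helper name, not part of any library, and it only handles string-typed metadata values:

```python
import struct

def read_gguf_architecture(path):
    """Parse just enough of a GGUF file's header to find general.architecture."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError("not a GGUF file")
        version, = struct.unpack("<I", f.read(4))        # GGUF version (e.g. 3)
        tensor_count, = struct.unpack("<Q", f.read(8))   # number of tensors
        kv_count, = struct.unpack("<Q", f.read(8))       # number of metadata KV pairs
        for _ in range(kv_count):
            key_len, = struct.unpack("<Q", f.read(8))
            key = f.read(key_len).decode("utf-8")
            vtype, = struct.unpack("<I", f.read(4))
            if vtype != 8:  # 8 = GGUF string type; this sketch handles only strings
                break       # general.architecture is conventionally the first key
            val_len, = struct.unpack("<Q", f.read(8))
            val = f.read(val_len).decode("utf-8")
            if key == "general.architecture":
                return val
    return None
```

If this returns an architecture your llama.cpp build doesn't know, the usual fix is updating llama.cpp (or, in text-generation-webui's case, waiting for a llama-cpp-python version bump that includes the new architecture) rather than anything on the model side.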