runtime error

:36<00:08, 177MB/s]
(…)-2-7b-chat-codeCherryPop.ggmlv3.q6_K.bin:  78%|███████▊  | 4.30G/5.53G [00:40<00:10, 118MB/s]
(…)-2-7b-chat-codeCherryPop.ggmlv3.q6_K.bin:  81%|████████▏ | 4.50G/5.53G [00:41<00:08, 122MB/s]
(…)-2-7b-chat-codeCherryPop.ggmlv3.q6_K.bin:  88%|████████▊ | 4.88G/5.53G [00:42<00:04, 162MB/s]
(…)-2-7b-chat-codeCherryPop.ggmlv3.q6_K.bin:  95%|█████████▍| 5.25G/5.53G [00:44<00:01, 198MB/s]
(…)-2-7b-chat-codeCherryPop.ggmlv3.q6_K.bin: 100%|█████████▉| 5.51G/5.53G [00:45<00:00, 207MB/s]
(…)-2-7b-chat-codeCherryPop.ggmlv3.q6_K.bin: 100%|█████████▉| 5.53G/5.53G [00:45<00:00, 122MB/s]
gguf_init_from_file: invalid magic characters 'tjgg'
error loading model: llama_model_loader: failed to load model from /home/user/.cache/huggingface/hub/models--TheBloke--llama2-7b-chat-codeCherryPop-qLoRA-GGML/snapshots/1fc5a35aa955b85102e23c69028a9ba1af575f2e/llama-2-7b-chat-codeCherryPop.ggmlv3.q6_K.bin
llama_load_model_from_file: failed to load model
AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
Traceback (most recent call last):
  File "/home/user/app/app.py", line 10, in <module>
    llm = Llama(model_path= hf_hub_download(repo_id="TheBloke/llama2-7b-chat-codeCherryPop-qLoRA-GGML", filename="llama-2-7b-chat-codeCherryPop.ggmlv3.q6_K.bin"), n_ctx=2048) #download model from hf/ n_ctx=2048 for high ccontext length
  File "/home/user/.local/lib/python3.10/site-packages/llama_cpp/llama.py", line 962, in __init__
    self._n_vocab = self.n_vocab()
  File "/home/user/.local/lib/python3.10/site-packages/llama_cpp/llama.py", line 2276, in n_vocab
    return self._model.n_vocab()
  File "/home/user/.local/lib/python3.10/site-packages/llama_cpp/llama.py", line 251, in n_vocab
    assert self.model is not None
AssertionError
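
The 'tjgg' magic reported above is consistent with a legacy GGML (ggjt) file being handed to a loader that expects GGUF, which would explain why the model is None by the time n_vocab() runs. A minimal sketch for checking a downloaded file before passing it to Llama follows. The function name detect_model_format and the set of legacy magics are illustrative assumptions, not part of llama-cpp-python.

```python
# Hypothetical helper (not part of llama_cpp): inspect the first four bytes
# of a model file to guess whether it is GGUF or a legacy GGML variant.

GGUF_MAGIC = b"GGUF"
# Assumed on-disk byte order of the legacy magics ggjt, ggml and ggmf.
LEGACY_GGML_MAGICS = {b"tjgg", b"lmgg", b"fmgg"}


def detect_model_format(path):
    """Return "gguf", "ggml" or "unknown" based on the file's magic bytes."""
    with open(path, "rb") as f:
        magic = f.read(4)
    if magic == GGUF_MAGIC:
        return "gguf"
    if magic in LEGACY_GGML_MAGICS:
        return "ggml"
    return "unknown"
```

If this returns "ggml" for the downloaded .bin file, the likely options are to switch to a GGUF build of the same model or to use an older llama-cpp-python release that still reads GGML files.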
