runtime error

Downloading model: LinkSoul/Chinese-Llama-2-7b-ggml/Chinese-Llama-2-7b.ggmlv3.q4_0.bin
Chinese-Llama-2-7b.ggmlv3.q4_0.bin: 0%| | 0.00/3.83G [00:00<?, ?B/s]
Chinese-Llama-2-7b.ggmlv3.q4_0.bin: 1%| | 21.0M/3.83G [00:03<10:58, 5.77MB/s]
Chinese-Llama-2-7b.ggmlv3.q4_0.bin: 28%|██▊ | 1.07G/3.83G [00:06<00:14, 185MB/s]
Chinese-Llama-2-7b.ggmlv3.q4_0.bin: 37%|███▋ | 1.41G/3.83G [00:09<00:14, 166MB/s]
Chinese-Llama-2-7b.ggmlv3.q4_0.bin: 55%|█████▌ | 2.12G/3.83G [00:12<00:09, 186MB/s]
Chinese-Llama-2-7b.ggmlv3.q4_0.bin: 64%|██████▍ | 2.45G/3.83G [00:14<00:07, 186MB/s]
Chinese-Llama-2-7b.ggmlv3.q4_0.bin: 83%|████████▎ | 3.17G/3.83G [00:15<00:02, 260MB/s]
Chinese-Llama-2-7b.ggmlv3.q4_0.bin: 100%|█████████▉| 3.83G/3.83G [00:16<00:00, 236MB/s]
Downloaded /home/user/.cache/huggingface/hub/models--LinkSoul--Chinese-Llama-2-7b-ggml/snapshots/73ad3302529fd9627442cfef0027e8c091917741/Chinese-Llama-2-7b.ggmlv3.q4_0.bin
Traceback (most recent call last):
  File "/home/user/app/app.py", line 6, in <module>
    from model import get_input_token_length, run
  File "/home/user/app/model.py", line 22, in <module>
    llm = Llama(model_path=model_path, n_ctx=4000, verbose=False)
  File "/home/user/.local/lib/python3.10/site-packages/llama_cpp/llama.py", line 962, in __init__
    self._n_vocab = self.n_vocab()
  File "/home/user/.local/lib/python3.10/site-packages/llama_cpp/llama.py", line 2274, in n_vocab
    return self._model.n_vocab()
  File "/home/user/.local/lib/python3.10/site-packages/llama_cpp/llama.py", line 251, in n_vocab
    assert self.model is not None
AssertionError
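
The traceback ends at `assert self.model is not None`, which indicates that the underlying model was never loaded; the log itself does not say why. One possible cause, assumed here rather than confirmed by the log, is a file-format mismatch: the downloaded file is a `ggmlv3` file, while newer llama-cpp-python releases expect GGUF files. The sketch below checks the first four bytes of a model file to tell the two apart. The function name `detect_model_format` and the returned labels are hypothetical, chosen only for illustration.

```python
# Magic bytes as they appear on disk. Assumption: the formats store a
# little-endian uint32 magic at offset 0, so the legacy magics read "backwards".
GGUF_MAGIC = b"GGUF"
LEGACY_GGML_MAGICS = {
    b"tjgg": "ggjt",  # used by ggmlv3-era files
    b"lmgg": "ggml",
    b"fmgg": "ggmf",
}


def detect_model_format(path):
    """Return 'gguf', 'legacy-ggml', or 'unknown' based on the file's first 4 bytes."""
    with open(path, "rb") as f:
        magic = f.read(4)
    if magic == GGUF_MAGIC:
        return "gguf"
    if magic in LEGACY_GGML_MAGICS:
        return "legacy-ggml"
    return "unknown"
```

If this reports `legacy-ggml` for the downloaded file, the likely options are to switch to a GGUF build of the model or to pin an older llama-cpp-python release that still reads GGML files. Both are suggestions to verify, not confirmed fixes.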
