TheBloke/falcon-40b-instruct-GGML · I tried to load falcon40b-instruct.ggmlv3.q8_0.bin directly into text-generation-webui but I got this error

Jun 23, 2023

I tried to load falcon40b-instruct.ggmlv3.q8_0.bin directly into text-generation-webui but I got this error.
Is there a way to solve this problem?

2023-06-23 13:07:29 INFO:llama.cpp weights detected: models/falcon40b-instruct.ggmlv3.q8_0.bin

2023-06-23 13:07:29 INFO:Cache capacity is 0 bytes
llama.cpp: loading model from models/falcon40b-instruct.ggmlv3.q8_0.bin
error loading model: missing tok_embeddings.weight
llama_init_from_file: failed to load model
2023-06-23 13:07:29 ERROR:Failed to load the model.
Traceback (most recent call last):
File "/lhome/saidaner/text-generation-webui/server.py", line 62, in load_model_wrapper
shared.model, shared.tokenizer = load_model(shared.model_name, loader)
File "/lhome/saidaner/text-generation-webui/modules/models.py", line 66, in load_model
output = load_func_map[loader](model_name)
File "/lhome/saidaner/text-generation-webui/modules/models.py", line 247, in llamacpp_loader
model, tokenizer = LlamaCppModel.from_pretrained(model_file)
File "/lhome/saidaner/text-generation-webui/modules/llamacpp_model.py", line 55, in from_pretrained
result.model = Llama(**params)
File "/lhome/saidaner/miniconda3/envs/textgen/lib/python3.10/site-packages/llama_cpp/llama.py", line 287, in __init__
assert self.ctx is not None
AssertionError

Exception ignored in: <function LlamaCppModel.__del__ at 0x155414974c10>
Traceback (most recent call last):
File "/lhome/saidaner/text-generation-webui/modules/llamacpp_model.py", line 29, in __del__
self.model.__del__()
AttributeError: 'LlamaCppModel' object has no attribute 'model'

TheBloke

Owner Jun 23, 2023

Please see the README. text-generation-webui does not support Falcon GGML at this time.

You can try LoLLMS Web UI which just added support for Falcon GGML.
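
A side note on the second traceback in the log (the AttributeError raised in the destructor): it looks like a follow-on effect of the failed load rather than a separate problem. The constructor raised before the `model` attribute was assigned, and Python then ran the destructor on the half-built object. Here is a minimal sketch of that pattern and a defensive guard. All class names here are hypothetical stand-ins for illustration, not text-generation-webui's or llama-cpp-python's actual code.

```python
class FakeBackend:
    """Hypothetical stand-in for a backend whose constructor can fail."""

    def __init__(self, path):
        # Assumption for illustration: pretend any "falcon" file is unsupported.
        if "falcon" in path.lower():
            raise AssertionError("backend context could not be created")
        self.path = path

    def close(self):
        pass


class FragileWrapper:
    """Destructor assumes self.model always exists."""

    @classmethod
    def from_pretrained(cls, path):
        result = cls()
        # If FakeBackend raises, result.model is never assigned.
        result.model = FakeBackend(path)
        return result

    def __del__(self):
        # Raises AttributeError when the load above failed, which Python
        # reports as "Exception ignored in: ... __del__".
        self.model.close()


class GuardedWrapper(FragileWrapper):
    """Same loader, but the destructor tolerates a failed load."""

    def __del__(self):
        model = getattr(self, "model", None)
        if model is not None:
            model.close()
```

With the guarded version, a failed load still raises the original error, but the extra "Exception ignored" noise goes away. It does not make an unsupported model load; it only keeps the log focused on the real cause.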

Azamorn

Jul 5, 2023

There is a pull request to make it work in text-generation-webui; see here:
https://github.com/oobabooga/text-generation-webui/pull/2828

TheBloke

Owner Jul 7, 2023

@Azamorn interesting, thanks for the link. That's for a different model backend. It won't support this Falcon model, but it would support other Falcon models made with CTranslate2. I will keep an eye on that, and if it's merged I will look at making some CTranslate2 models.