Failure loading in text-generation-webui.

#1
opened by Nafnlaus

Using the command:

python server.py --model maddes8cht_tiiuae-falcon-40b-instruct-gguf --listen --listen-port 4664 --verbose --api --xformers

... (or specifying a specific gguf file), I get:

2023-09-24 22:26:10 INFO:Loading settings from settings.json...
Traceback (most recent call last):
File "/home/user/text-generation-webui/server.py", line 216, in
model_settings = get_model_metadata(model_name)
File "/home/user/text-generation-webui/modules/models_settings.py", line 31, in get_model_metadata
for k in settings[pat]:
TypeError: 'NoneType' object is not iterable

I have no problem loading other ggufs.
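
From the traceback, whatever entry settings[pat] resolves to for this model is None. A minimal sketch of what I suspect is happening, purely guessed from the error message (the pattern-matching logic and the null entry are my assumptions, not the actual webui code):

import re

# Hypothetical settings mapping: one model-name pattern maps to None
# (e.g. an empty entry in a config file) instead of a dict of settings.
settings = {
    ".*falcon.*": None,                        # an empty entry like this
    ".*llama.*": {"truncation_length": 4096},  # a normal entry
}

model_name = "maddes8cht_tiiuae-falcon-40b-instruct-gguf"

model_settings = {}
for pat in settings:
    if re.match(pat.lower(), model_name.lower()):
        for k in settings[pat]:  # TypeError: 'NoneType' object is not iterable
            model_settings[k] = settings[pat][k]

If that guess is right, a guard like "if settings[pat] is None: continue" before the inner loop would avoid the crash.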

Can you specify on which quantization levels the error occurs?
All of them?
This was one of the very first models I quantized, and while things were not as smooth in the beginning, I'm pretty sure I tested them at least in Llama.cpp's main and server.

Did you get any of my other quantized models working?

All of them, sadly.

I haven't tried any of your other models yet - I was waiting for maddes8cht/ehartford-WizardLM-Uncensored-Falcon-40b, which it looks like is now online, so I'll download it and try it out as soon as I can. :)

Thanks for your work, BTW, on actually open-source-licensed models. I wish more people would pay attention to licensing! LLaMA 2's viral license is a lot more insidious than I think a lot of people realize: it harnesses the open-source community to do development for them, only allows the generations of LLaMA 2 derivatives to be used for training other LLaMA 2-derived models, and retains the full commercial rights if any project ever goes big. FYI, if you're looking for more open-source models, I noticed that TigerBot is Apache licensed. :)

I would really like to get this fixed.
So I am eager to know whether the WizardLM files work - please give me feedback.
It would also be helpful to get a link to some examples of gguf files that do work for you.

I've tested one of the wizardLM files (the smallest one), and it does work. :) Will test more later this evening.

Okay - that's strange.
I've tested both of them again, and they both work for me in Llama.cpp. I've also looked at the file in a hex viewer to check whether this one, the 40b instruct, has the same gguf file version, which it does. In the end I was almost sure that none of my models would work for you and that it's possibly a problem with Oobabooga.
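
For anyone who wants to check the same thing without a hex viewer, here's a rough Python sketch (assuming the standard GGUF header layout: a 4-byte "GGUF" magic followed by a little-endian uint32 version; the filename is just a placeholder):

import struct

def gguf_version(path):
    # GGUF files start with the 4-byte magic "GGUF",
    # followed by a little-endian uint32 format version.
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a gguf file, magic = {magic!r}")
        (version,) = struct.unpack("<I", f.read(4))
    return version

print(gguf_version("tiiuae-falcon-40b-instruct.Q6_K.gguf"))  # placeholder path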

But glad to hear that it works.
I tested with the files I quantized and then used for the upload.
I may have to upload the model again - maybe something went wrong with the upload. It took me a while to get the uploads right without constantly losing the connection, given these really big files and an asymmetric internet connection that only provides a fraction of its speed for uploads.

"FYI, if you're looking for more open-source models, I noticed that TigerBot is Apache licensed. :)"

It's still a Llama 2 based model, and I don't think relicensing it as Apache 2 is totally ok. They are allowed to license their own work as Apache 2, which should be mainly their training dataset, but not the full resulting model - I think this is exactly what Meta's Llama 2 license prohibits.

So this is a similarly blurry situation to the one with the early Alpaca models based on Llama (1): what they released as open source was their dataset, but the status of the resulting models was very unclear.

BTW, TheBloke already has the TigerBot models converted.

This is about models he doesn't seem to care about anymore.
I don't see a point in duplicating work he has already done.

"It's still a Llama 2 based model" - are you sure about that? I was under the impression that it was a foundational model. Says it was trained on 300B tokens, which would be an awful lot for a non-foundational model.

But yeah, no need to duplicate work :)

It's broken again as of about 9 days ago, due to changes to llama.cpp.

This model works for me with the latest llama.cpp. (Did not try in text-generation-webui though.)

Yes, all the falcon models were updated somewhere around the end of October / beginning of November and should work on current Llama.cpp versions.
:)

I get this error when loading with llama.cpp in text gen:

...
llama_model_loader: - type  f32:  242 tensors
llama_model_loader: - type q8_0:    1 tensors
llama_model_loader: - type q6_K:  241 tensors
ERROR: byte not found in vocab: '\n'

Fixed this by reinstalling text gen with the newest version.

Llama.cpp is under heavy construction; third-party software using it (like oobabooga) needs to be updated as well to stay current.
There have been several changes regarding falcon support in Llama.cpp, which might have taken some time to be implemented in Oobabooga.
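
If you want to rule the webui layer in or out, here's a quick sketch that exercises the gguf loader directly with the llama-cpp-python bindings from your text-gen environment (the model path is a placeholder):

from llama_cpp import Llama

try:
    # A tiny context is enough to run gguf loading and the vocab checks.
    llm = Llama(model_path="path/to/falcon-40b-instruct.Q6_K.gguf",
                n_ctx=512, verbose=True)
    print("model loaded fine")
except Exception as err:
    print(f"loader failed: {err}")

If this fails with the same vocab error, the installed llama-cpp-python is simply too old for the updated files; if it loads, the problem sits in the webui itself.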
