I am getting errors when I try to launch this in text-generation-webui

I have installed this model with the one-click install to my oobabooga webui, but I can't get it to run. When I use the default start-windows.bat, it throws this error:

OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory models\TheBloke_wizardLM-7B-GPTQ

Now, I have read in your repo that I should use this command to launch it:

cd text-generation-webui
python server.py --model wizardLM-7B-GPTQ --wbits 4 --groupsize 128 --model_type Llama # add any other command line args you want

but I am not really sure where to run it, so I tried running it from the text-generation-webui folder. It kind of worked, but it threw this error:

File "C:\AI\oobabooga_windows\oobabooga_windows\text-generation-webui\server.py", line 17, in
import gradio as gr
ModuleNotFoundError: No module named 'gradio'

This is kind of driving me nuts, does anyone know a fix?

You can convert the safetensors model; there is a ckpt2safetensors program that claims it can also convert back to ckpt, but I can't confirm that.

https://www.reddit.com/r/StableDiffusion/comments/zhozkz/safe_stable_ckpt2safetensors_conversion_toolgui/

It should definitely be possible to load it with the one-click-installer, as many other people are using this. But yes, you must specify the command line arguments --wbits 4 --groupsize 128 --model_type llama

I'm afraid I don't know how you're meant to specify that for the one-click-install.

The error you got trying to run it manually suggested that it couldn't find one of the Python dependencies, gradio. You must have that installed if text-generation-webui works in general, so that suggests that you haven't found the right way of executing server.py.

How do you launch text-generation-webui at the moment? Do you have a shortcut or something? If so, try right-clicking on it and viewing its properties to see if there are any command line arguments in there. Show me a screenshot if you like and I can tell you if it's the right place to add --wbits 4 --groupsize 128 --model_type llama

I don't use Windows and I can't really test it so other than that I'm not sure what to suggest. But it's definitely possible to get it working.
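
As an aside, a quick way to check whether that gradio error is an environment problem is to run a short snippet with the same Python interpreter that launches server.py. This is just a generic diagnostic, not something from the webui itself; if it reports "not installed", server.py is being run outside the environment the one-click installer set up:

import importlib.util  # standard library: check whether a module is importable without fully importing it

spec = importlib.util.find_spec("gradio")
print("gradio:", spec.origin if spec else "not installed in this environment")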

It's not related to it being a safetensors file. That will load fine if the correct command line arguments are given to text-generation-webui

To add "--wbits 4 --groupsize 128 --model_type llama", edit line 146 of the "webui.py" file in the oobabooga folder (see the sketch below).
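For reference, the edited line would then look roughly like the sketch below. The exact line number and the flags already present (e.g. --chat or --model-menu) vary between installer versions, so treat this as an illustration rather than the exact file contents:

run_cmd("python server.py --chat --wbits 4 --groupsize 128 --model_type llama") # put your flags here!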

After using the CUDA branch of GPTQ-for-LLaMa, even the act-order model is working perfectly on oobabooga on Windows.

Great, thanks for the details Anthrax

Just so I'm clear - to get the act-order file working on Windows, you downloaded and self-compiled the CUDA branch of GPTQ-for-LLaMa? It still doesn't work with the pre-compiled ooba fork in the one-click-installers?

Thanks so much for all the helpful comments. I solved it now by just reinstalling with the guide from Aitrepreneur.

I just followed the instructions from your vicuna-7B-1.1-GPTQ-4bit-128g page, and it works now.

The one that ooba installed automatically still gives random characters for me.

Yeah sorry, until a few minutes ago ooba would automatically install the file that requires the latest GPTQ-for-LLaMa code.

I've fixed that now by renaming the files so the no-act-order file will always be loaded in preference to the act-order file, and I've put these instructions for easy installation in the README:

How to easily download and use this model in text-generation-webui

Load text-generation-webui as you normally do.

  1. Click the Model tab.
  2. Under Download custom model or LoRA, enter this repo name: TheBloke/stable-vicuna-13B-GPTQ.
  3. Click Download.
  4. Wait until it says it's finished downloading.
  5. As this is a GPTQ model, fill in the GPTQ parameters on the right: Bits = 4, Groupsize = 128, model_type = Llama
  6. Now click the Refresh icon next to Model in the top left.
  7. In the Model drop-down, choose this model: stable-vicuna-13B-GPTQ.
  8. Click Reload the Model in the top right.
  9. Once it says it's loaded, click the Text Generation tab and enter a prompt!
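
For anyone who prefers the command line, a rough equivalent of the steps above would be the following, assuming the webui's download-model.py helper script and the models\TheBloke_... folder naming shown in the error at the top of this thread:

cd text-generation-webui
python download-model.py TheBloke/stable-vicuna-13B-GPTQ
python server.py --model TheBloke_stable-vicuna-13B-GPTQ --wbits 4 --groupsize 128 --model_type llama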

Is it possible to start the WebUI without a model if there is an error loading a model? That would be better than just ending the program; the user could then load a model and set the parameters in the WebUI instead.

Modify webui.py, specifically this line:
run_cmd("python server.py --chat --model-menu") # put your flags here!
to this:
run_cmd("python server.py --chat") # put your flags here!

Use this one instead:

run_cmd("python server.py --chat --auto-devices") # put your flags here!

Ok, thanks. I would hope that they could make a change to the error routine so that it doesn't end the program, as failing to load a model isn't a critical error (the WebUI works without any model loaded).

I was experimenting with oobabooga and noticed something weird.
It downloads the compat.no-act-order file first, as expected.
But if both safetensors files are downloaded, it loads the one that comes last alphabetically, i.e. the latest.act-order file.
Sorry for all the renaming trouble. Seems like deleting the latest.act-order file is the easiest solution for oobabooga users currently.
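
If it helps, that behaviour would be consistent with a loader that simply sorts the matching filenames and takes the last one. A hypothetical sketch of that, not the actual webui code, using an example folder name:

# Hypothetical illustration of the behaviour described above, NOT the real loader code.
# Sorting puts "latest.act-order" after "compat.no-act-order", so taking the
# last match selects the act-order file when both are present.
from pathlib import Path

model_dir = Path("models/TheBloke_stable-vicuna-13B-GPTQ")  # example folder name
files = sorted(model_dir.glob("*.safetensors"))
print(files[-1] if files else "no .safetensors files found")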

Oh you're kidding me! I renamed them all because I thought it loaded the first! I suppose I should have actually tested it :D

OK thanks, I will rename them all back, I guess...

On my newer repos I'm going for a more sophisticated system where there's only one model per repo, and other versions are available via branches. I put in a PR with ooba to enable auto download from different branches, e.g. with TheBloke/stable-vicuna-13B-GPTQ:latest. It's not been merged yet, though.
