I am getting errors when I try to launch this in text-generation-webui
I have installed this model with the one-click install in my oobabooga webui, but I can't get it to run. When I use the default start-windows.bat, it throws this error:
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory models\TheBloke_wizardLM-7B-GPTQ
Now, I have read in your repo that I should use these commands to launch it:
cd text-generation-webui
python server.py --model wizardLM-7B-GPTQ --wbits 4 --groupsize 128 --model_type Llama # add any other command line args you want
but I am not really sure where to run them, so I tried running them from the text-generation-webui folder. It kind of worked, but it threw this error:
File "C:\AI\oobabooga_windows\oobabooga_windows\text-generation-webui\server.py", line 17, in <module>
import gradio as gr
ModuleNotFoundError: No module named 'gradio'
This is kind of driving me nuts, does anyone know a fix?
You can convert the safetensors model; there is a ckpt2safetensors program that claims it can go back to ckpt, but I can't confirm that.
It should definitely be possible to load it with the one-click installer, as many other people are doing so. But yes, you must specify the command line arguments --wbits 4 --groupsize 128 --model_type llama
I'm afraid I don't know how you're meant to specify that for the one-click-install.
The error you got when trying to run it manually suggests that it couldn't find one of the Python dependencies, gradio. You must have that installed if text-generation-webui works in general, so it suggests that you haven't found the right way of executing server.py.
How do you launch text-generation-webui at the moment? Do you have a shortcut or something? If so, try right-clicking on it and viewing its properties to see if there are command line arguments in there. Show me a screenshot if you like and I can tell you if it's the right place to add --wbits 4 --groupsize 128 --model_type llama
I don't use Windows and I can't really test it, so other than that I'm not sure what to suggest. But it's definitely possible to get it working.
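For what it's worth, if your one-click install includes a cmd_windows.bat (I believe newer versions do), opening that gives you a terminal with the bundled Python environment already activated, which should avoid the missing gradio error. From there the launch would look roughly like the command in the README:
cd text-generation-webui
python server.py --model wizardLM-7B-GPTQ --wbits 4 --groupsize 128 --model_type llama # plus --chat or any other args you normally use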
It's not related to it being a safetensors file. That will load fine if the correct command line arguments are given to text-generation-webui.
Great, thanks for the details Anthrax
Just so I'm clear - to get the act-order file working on Windows, you downloaded and self-compiled the CUDA branch of GPTQ-for-LLaMa? It still doesn't work with the pre-compiled ooba fork in the one-click-installers?
Thanks so much for all the helpful comments. I solved it by just reinstalling, following the guide from Aitrepreneur.
Yeah sorry, until a few minutes ago ooba would automatically install the file that requires the latest GPTQ-for-LLaMa code.
I've fixed that now by renaming the files so the no-act-order file will always be loaded in preference to the act-order file, and I've put these instructions for easy installation in the README:
How to easily download and use this model in text-generation-webui
- Load text-generation-webui as you normally do.
- Click the Model tab.
- Under Download custom model or LoRA, enter this repo name: TheBloke/stable-vicuna-13B-GPTQ.
- Click Download.
- Wait until it says it's finished downloading.
- As this is a GPTQ model, fill in the GPTQ parameters on the right: Bits = 4, Groupsize = 128, model_type = Llama.
- Now click the Refresh icon next to Model in the top left.
- In the Model drop-down, choose this model: stable-vicuna-13B-GPTQ.
- Click Reload the Model in the top right.
- Once it says it's loaded, click the Text Generation tab and enter a prompt!
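If you'd rather download from the command line, text-generation-webui also ships a download-model.py script that should do the same thing as the Download button (run it from the text-generation-webui folder, inside its environment):
cd text-generation-webui
python download-model.py TheBloke/stable-vicuna-13B-GPTQ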
Is it possible to start the WebUI without a model if there is an error loading one? That would be better than just ending the program; the user could then load a model and set the parameters in the WebUI instead.
Modify webui.py, specifically this line:
run_cmd("python server.py --chat --model-menu") # put your flags here!
to this:
run_cmd("python server.py --chat") # put your flags here!
Or use this one instead:
run_cmd("python server.py --chat --auto-device") # put your flags here!
Ok, thanks. I would hope they could change the error handling so that it doesn't end the program, as failing to load a model isn't a critical error (the WebUI works without any model loaded).
I was experimenting with oobabooga and noticed something weird.
It downloads the compat.no-act-order file first, as expected.
But if both safetensors files are downloaded, it loads the one that comes last alphabetically, which is the latest.act-order file.
Sorry for all the renaming trouble. Deleting the latest.act-order file seems to be the easiest solution for oobabooga users at the moment.
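To picture the selection, here's a minimal sketch of that behaviour as I understand it (the folder name is just an example from this thread; the actual loader code may differ):
# sketch: pick the .safetensors file that sorts last alphabetically
from pathlib import Path
model_dir = Path("models/TheBloke_wizardLM-7B-GPTQ")  # example model folder
candidates = sorted(model_dir.glob("*.safetensors"))  # alphabetical order
if candidates:
    print("Would load:", candidates[-1].name)  # "...latest.act-order..." sorts after "...compat.no-act-order..."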
Oh, you're kidding me! I renamed them all because I thought it loaded the first one! I suppose I should have actually tested it :D
OK thanks, I will rename them all back I guess..
On my newer repos I'm going for a more sophisticated system where there's only one model per repo, and other versions are available via branches. I put in a PR with ooba to enable auto download from different branches, e.g. with TheBloke/stable-vicuna-13B-GPTQ:latest. It's not been merged yet though.
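Until that's merged, the manual equivalent should just be a direct git clone of the branch into your models folder (assuming the repo exposes a latest branch, as the :latest syntax implies; the target folder name follows the models\TheBloke_... pattern seen earlier):
git lfs install
git clone --branch latest https://huggingface.co/TheBloke/stable-vicuna-13B-GPTQ models/TheBloke_stable-vicuna-13B-GPTQ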