Some errors occurred when I used the webui. Can anyone tell me why?

#13 by yogurt111 - opened

Traceback (most recent call last):
  File "/Users/dev/WebstormProjects/text-generation-webui/server.py", line 103, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "/Users/dev/WebstormProjects/text-generation-webui/modules/models.py", line 85, in load_model
    model = LoaderClass.from_pretrained(Path(f"{shared.args.model_dir}/{model_name}"), low_cpu_mem_usage=True, torch_dtype=torch.bfloat16 if shared.args.bf16 else torch.float16, trust_remote_code=trust_remote_code)
  File "/usr/local/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 471, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2405, in from_pretrained
    raise EnvironmentError(
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory models/stable-vicuna-13B-GPTQ.

Please follow the instructions in the README - you need to set the GPTQ parameters for the model: bits = 4, groupsize = 128, model_type = llama. Then click "save settings for this model" and "reload model".
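
If you prefer to set these at launch time instead of in the UI, the same parameters can be passed as command-line flags. A minimal sketch, assuming a version of text-generation-webui from around this time that still uses the --wbits/--groupsize/--model_type flags, and that your model folder is named stable-vicuna-13B-GPTQ:

python server.py --model stable-vicuna-13B-GPTQ --wbits 4 --groupsize 128 --model_type llama

This should be equivalent to saving the three settings in the UI and reloading the model.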

Can this webui use other models?

Yes of course - it supports most models out there: unquantised fp16 HF models on GPU, the same models in 8-bit on GPU, and these GPTQ models in 4-bit on GPU. It supports Llama models, GPT-J, OPT, GPT-NeoX, RWKV, and others.

And it also supports loading Llama GGML models for CPU inference.
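
If you want to sanity-check a GGML file on CPU outside the webui, here is a minimal sketch using the llama-cpp-python bindings (which, as far as I know, is also what the webui uses under the hood for GGML). The file name and prompt below are placeholders, not files from this repo:

from llama_cpp import Llama

# Placeholder path to a quantised GGML file - adjust to whatever you downloaded
llm = Llama(model_path="models/stable-vicuna-13B.ggml.q4_0.bin", n_ctx=2048)

# Run a short completion entirely on CPU
output = llm("### Human: Hello, who are you?\n### Assistant:", max_tokens=128)
print(output["choices"][0]["text"])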

I'd think text-generation-webui supports more models than any other UI right now.
