Does not load

#1
by Toaster496 - opened

Doesn't load. What have I missed?

Traceback (most recent call last):
  File "C:\Users\User\text-generation-webui\server.py", line 68, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\text-generation-webui\modules\models.py", line 74, in load_model
    output = load_func_map[loader](model_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\text-generation-webui\modules\models.py", line 288, in AutoGPTQ_loader
    return modules.AutoGPTQ_loader.load_quantized(model_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\text-generation-webui\modules\AutoGPTQ_loader.py", line 56, in load_quantized
    model = AutoGPTQForCausalLM.from_quantized(path_to_model, **params)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\auto_gptq\modeling\auto.py", line 94, in from_quantized
    return quant_func(
           ^^^^^^^^^^^
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\auto_gptq\modeling\_base.py", line 749, in from_quantized
    make_quant(
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\auto_gptq\modeling\_utils.py", line 92, in make_quant
    make_quant(
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\auto_gptq\modeling\_utils.py", line 92, in make_quant
    make_quant(
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\auto_gptq\modeling\_utils.py", line 92, in make_quant
    make_quant(
  [Previous line repeated 1 more time]
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\auto_gptq\modeling\_utils.py", line 84, in make_quant
    new_layer = QuantLinear(
                ^^^^^^^^^^^^
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\auto_gptq\nn_modules\qlinear\qlinear_cuda_old.py", line 83, in __init__
    self.autogptq_cuda = autogptq_cuda_256
                         ^^^^^^^^^^^^^^^^^
NameError: name 'autogptq_cuda_256' is not defined

This issue is caused by AutoGPTQ not being correctly compiled.

In general, as you're using text-generation-webui, I suggest you use ExLlama instead if you can. It's faster, uses less VRAM, and is automatically compiled for you by text-generation-webui. The only restriction is that it can't load the 8-bit quants I provided.

You're getting this AutoGPTQ issue because you're using Python 3.11, and there are pre-compiled AutoGPTQ wheels only for Python 3.8, 3.9 and 3.10.
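A quick way to check which situation you're in, based on the version cutoff described above (a sketch; the exact wheel coverage is as stated in this thread, 3.8 through 3.10):

```python
import sys

# Pre-compiled AutoGPTQ wheels cover CPython 3.8-3.10 (per this thread),
# so anything newer falls back to building the CUDA extension from source.
if sys.version_info[:2] > (3, 10):
    print("no pre-built AutoGPTQ wheel for this Python; use 3.10 or build from source")
else:
    print("a pre-built AutoGPTQ wheel should exist for this Python")
```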

You have two options:

  1. Install and use Python 3.10 instead, ideally in a virtual environment like MiniConda.

In fact the text-generation-webui already provides a one-click installer which installs a Python 3.10 conda environment, with everything installed in it that you need.

So that is my recommendation to you: download and use the text-generation-webui one click installer instead. It should handle everything for you automatically, providing both ExLlama and a working AutoGPTQ.

  2. If you really want to try to get AutoGPTQ running in your manual install with Python 3.11, you would need to build it yourself. This requires the CUDA toolkit to be installed, and is often much more challenging on Windows than it is on Linux. These are the commands you would run:
pip3 uninstall -y auto-gptq
set GITHUB_ACTIONS=true
pip3 install -v auto-gptq
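After the build finishes, you can sanity-check it by trying to import the compiled extension that the traceback says is missing (a sketch, not part of text-generation-webui; `autogptq_cuda_256` is the module name from the error above):

```python
# If this import fails, the AutoGPTQ CUDA kernels were not compiled and
# loading a GPTQ model will hit the same NameError as in the traceback.
try:
    import autogptq_cuda_256  # compiled extension present in working AutoGPTQ builds
    print("AutoGPTQ CUDA kernels: available")
except ImportError:
    print("AutoGPTQ CUDA kernels: NOT compiled")
```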

Using lm-evaluation-harness in autogptq mode, I get "Could not find model":
File "/root/anaconda3/envs/infer/lib/python3.9/site-packages/auto_gptq/modeling/_base.py", line 768, in from_quantized
raise FileNotFoundError(f"Could not find model in {model_name_or_path}")
FileNotFoundError: Could not find model in TheBloke/Llama-2-70B-GPTQ
