AUTOGPTQ Error in Google Colab
When trying to load the model in Google Colab, I get the error:
ImportError: Loading a GPTQ quantized model requires optimum (pip install optimum) and auto-gptq library (pip install auto-gptq)
My code is as follows:
!pip install -q -U transformers peft accelerate optimum
!pip install auto-gptq
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("TheBloke/Llama-2-7b-Chat-GPTQ")
model = AutoModelForCausalLM.from_pretrained("TheBloke/Llama-2-7b-Chat-GPTQ") #ERROR HAPPENS HERE
If I try a different 7B GPTQ model, it doesn't give the error, for example:
model = AutoModelForCausalLM.from_pretrained("edumunozsala/llama-2-7b-int4-python-code-20k")
Not sure why it's working with that other model and not this one. But please try installing AutoGPTQ as follows:
!pip install auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/
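One thing that may also matter (a hedged note, not something confirmed in this thread): transformers decides whether optimum and auto-gptq are available when it is first imported, so if the pip installs run after transformers has already been imported in the same session, the check can keep failing until the runtime is restarted. A minimal Colab-style sketch of that ordering, reusing the repo from the question and the wheel index above:

!pip install -q -U transformers peft accelerate optimum
!pip install -q auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/

# If transformers was already imported in this session, restart the runtime here,
# then run the imports and the load in a fresh cell.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "TheBloke/Llama-2-7b-Chat-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")  # device_map needs accelerate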
As I understand it, the error might occur because the transformers library cannot detect that the auto-gptq library exists.
The _is_package_available function in transformers uses this code: package_exists = importlib.util.find_spec(pkg_name) is not None
The problem might be in importlib itself; I can't find the "util" module in it (Python 3.10.12, Kaggle notebook).
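A quick way to test that diagnosis from the same notebook (a small sketch; that transformers probes the import names optimum and auto_gptq is my assumption, not something stated above):

import importlib.util

# Mirrors what _is_package_available does: find_spec returns a spec only if the package is importable.
for pkg in ("optimum", "auto_gptq"):
    spec = importlib.util.find_spec(pkg)
    print(pkg, "is visible" if spec is not None else "is NOT visible")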
Yeah, I am facing the same error. I got it working by using LangChain's CTransformers: CTransformers(model="TheBloke/Llama-2-7b-Chat-GPTQ"). But I still want to download this model with from_pretrained and then use it on local hardware.
Is there any solution?
Please help with this error.
from transformers import AutoTokenizer, pipeline, logging, AutoModelForCausalLM
#from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
model_name_or_path = "TheBloke/Llama-2-7b-Chat-GPTQ"
model_basename = "model"
use_triton = False
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    model_basename=model_basename,
    use_safetensors=True,
    trust_remote_code=True,
    device="cuda:0",
    use_triton=use_triton,
    quantize_config=None,
)
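For what it's worth (an observation, not a confirmed fix): the keyword arguments in that snippet (model_basename, use_triton, quantize_config, device) look like they come from the auto_gptq API rather than from transformers, and AutoModelForCausalLM.from_pretrained does not accept them. If the intent is to keep that style of call, a sketch using the commented-out AutoGPTQForCausalLM import, where those arguments do belong, would look roughly like this:

# Loading the same repo through auto_gptq directly instead of transformers.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_name_or_path = "TheBloke/Llama-2-7b-Chat-GPTQ"
model_basename = "model"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    model_basename=model_basename,
    use_safetensors=True,
    trust_remote_code=True,
    device="cuda:0",
    use_triton=False,
    quantize_config=None,
)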