Running this model on Colab

#6
by deeplypeppermint - opened

Hi, I've been having some trouble testing this model on Google Colab (free tier). I get an error when I call AutoGPTQForCausalLM.from_quantized. Could anyone please help?

My code:

!pip install transformers accelerate einops sentencepiece
!git clone https://github.com/PanQiWei/AutoGPTQ
!pip install ./AutoGPTQ/
import torch
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM
model_name_or_path = "TheBloke/koala-13B-GPTQ-4bit-128g"
model_basename = "koala-13B-4bit-128g.safetensors"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=False)
model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    model_basename=model_basename,
    device="cuda:0",
    use_triton=False,
    use_safetensors=True,
    torch_dtype=torch.float32,
    trust_remote_code=False,
)

My error:

in <cell line: 3>:3 │
│ │
│ /usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/auto.py:85 in from_quantized │
│ │
│ 82 │ │ model_type = check_and_get_model_type(save_dir or model_name_or_path, trust_remo │
│ 83 │ │ quant_func = GPTQ_CAUSAL_LM_MODEL_MAP[model_type].from_quantized │
│ 84 │ │ keywords = {key: kwargs[key] for key in signature(quant_func).parameters if key │
│ ❱ 85 │ │ return quant_func( │
│ 86 │ │ │ model_name_or_path=model_name_or_path, │
│ 87 │ │ │ save_dir=save_dir, │
│ 88 │ │ │ device_map=device_map, │
│ │
│ /usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/_base.py:666 in from_quantized │
│ │
│ 663 │ │ │ raise TypeError(f"{config.model_type} isn't supported yet.") │
│ 664 │ │ │
│ 665 │ │ if quantize_config is None: │
│ ❱ 666 │ │ │ quantize_config = BaseQuantizeConfig.from_pretrained(model_name_or_path, **k │
│ 667 │ │ │
│ 668 │ │ if model_basename is None: │
│ 669 │ │ │ if quantize_config.model_file_base_name: │
│ │
│ /usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/_base.py:90 in from_pretrained │
│ │
│ 87 │ │ │ │ │ _commit_hash=commit_hash, │
│ 88 │ │ │ ) │
│ 89 │ │ │
│ ❱ 90 │ │ with open(resolved_config_file, "r", encoding="utf-8") as f: │
│ 91 │ │ │ return cls(**json.load(f)) │
│ 92 │ │
│ 93 │ def to_dict(self): │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: expected str, bytes or os.PathLike object, not NoneType
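For reference, the TypeError at the bottom of the trace is raised inside BaseQuantizeConfig.from_pretrained when no quantize_config.json can be resolved for the repo, so the config file path ends up being None. A minimal workaround sketch, under two assumptions: the bits and group size are inferred from the "4bit-128g" naming, and model_basename is passed without the .safetensors extension (AutoGPTQ appends the extension itself when use_safetensors=True):

```python
import torch
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_name_or_path = "TheBloke/koala-13B-GPTQ-4bit-128g"

# The repo has no quantize_config.json, so supply the quantization
# parameters manually (values inferred from the "4bit-128g" name).
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=False)
model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    model_basename="koala-13B-4bit-128g",  # no ".safetensors" suffix here
    use_safetensors=True,
    device="cuda:0",
    use_triton=False,
    quantize_config=quantize_config,
)
```

This still needs a GPU runtime on Colab; the quantized 13B weights are several GB, so the download alone takes a while on the free tier.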

Hey, so did you find a way to run GPTQ models on Colab?

If you are only looking to do inference, I was able to run some models on Colab using very simple code taken from other sources:

https://colab.research.google.com/drive/1rqLLYCoD4YlcSkzVp_b0NKpHxUBe3a2S?usp=sharing

I was able to run Falcon (very slowly) and Vicuna, but not Koala.

Thanks, but I figured out a way to run Llama-2 on Colab. It was actually pretty fast.

Nice! I haven't tried Llama-2 yet, but if you don't mind sharing your source, it would help me a lot.
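In the meantime, one common pattern for running a GPTQ-quantized Llama-2 on Colab with AutoGPTQ looks like the sketch below. This is illustrative only, not necessarily what the poster above did; the repo name and generation settings are assumptions:

```python
import torch
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Example GPTQ repo (an assumption, pick whichever Llama-2 GPTQ repo you use).
model_name = "TheBloke/Llama-2-7B-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_name,
    use_safetensors=True,
    device="cuda:0",
    use_triton=False,
)

prompt = "Explain what GPTQ quantization does in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

As with the Koala example above, this requires a GPU runtime; a 7B 4-bit model fits comfortably in the free-tier T4's memory, which is likely why it feels fast compared with larger models.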
