Model cannot be found when used with the auto_gptq library
Hi there, I'm trying to run an instance of this model on Google Colab with the code below:
from transformers import AutoTokenizer, TextGenerationPipeline
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
import logging

quantized_model_dir = "/content/drive/MyDrive/Wizard-Vicuna-13B-Uncensored-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, use_fast=True)

try:
    quantize_config = BaseQuantizeConfig.from_pretrained(quantized_model_dir)
except:
    quantize_config = BaseQuantizeConfig(
        bits=4,
        group_size=128
    )

# load the quantized model from the local directory onto the first GPU
model = AutoGPTQForCausalLM.from_quantized(
    quantized_model_dir,
    device="cuda:0",
    use_safetensors=True,
    use_triton=False,
    quantize_config=quantize_config
)

pipeline = TextGenerationPipeline(model=model, tokenizer=tokenizer)
print(pipeline("auto-gptq is")[0]["generated_text"])
However, regardless of whether I set quantized_model_dir to a location in my Drive or to the temporary file location, it always returns the following traceback:
Traceback (most recent call last):

in <cell line: 18>:18

/usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/auto.py:82 in from_quantized

    79         model_type = check_and_get_model_type(save_dir or model_name_or_path, trust_remo
    80         quant_func = GPTQ_CAUSAL_LM_MODEL_MAP[model_type].from_quantized
    81         keywords = {key: kwargs[key] for key in signature(quant_func).parameters if key
 ❱  82         return quant_func(
    83             model_name_or_path=model_name_or_path,
    84             save_dir=save_dir,
    85             device_map=device_map,

/usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/_base.py:698 in from_quantized

   695                     break
   696
   697         if resolved_archive_file is None:  # Could not find a model file to use
 ❱ 698             raise FileNotFoundError(f"Could not find model in {model_name_or_path}")
   699
   700         model_save_name = resolved_archive_file
   701
FileNotFoundError: Could not find model in /content/drive/MyDrive/Wizard-Vicuna-13B-Uncensored-GPTQ
I've made sure I downloaded all the files from Hugging Face (including the tokenizer_config.json file) and have them in my Drive.
Does anyone know why it won't detect the model?
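(A quick sanity check, just an illustrative sketch assuming the Drive is already mounted in the Colab session, is to list what the loader actually sees in that folder:)

import os

# List the files in the model directory; the .safetensors file and
# quantize_config.json should show up here if the download is complete.
print(sorted(os.listdir("/content/drive/MyDrive/Wizard-Vicuna-13B-Uncensored-GPTQ")))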
Yeah, you need to specify model_basename:
model_basename = "Wizard-Vicuna-13B-Uncensored-GPTQ-4bit-128g.compat.no-act-order"
model = AutoGPTQForCausalLM.from_quantized(
quantized_model_dir,
model_basename=model_basename
device="cuda:0",
use_safetensors=True,
use_triton=False,
quantize_config=quantize_config
)
So that it knows what the name of the safetensors file is.
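If you'd rather not hard-code that string, here's a minimal sketch (assuming the directory contains exactly one .safetensors file) that derives model_basename from whatever is on disk:

import os

# from_quantized() looks for "<model_basename>.safetensors" when use_safetensors=True,
# so take the single .safetensors file in the directory and strip its extension.
safetensors_files = [f for f in os.listdir(quantized_model_dir) if f.endswith(".safetensors")]
model_basename = os.path.splitext(safetensors_files[0])[0]
print(model_basename)  # e.g. Wizard-Vicuna-13B-Uncensored-GPTQ-4bit-128g.compat.no-act-order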
Oh I realised I was missing the model_basename argument. All good, thanks!
Thanks so much, I just realised the legend of the forum answered after I posted my last comment. Cheers!
You're welcome!