load model from different branches

#2 · opened by Klimkou

Hi, I was trying to load the 8bit-128g-actorder_False version of the model with the following code:
"""
from transformers import AutoTokenizer, pipeline, logging
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_name_or_path = "TheBloke/vicuna-13b-v1.3.0-GPTQ"
model_basename = "gptq-8bit-128g-actorder_False"  # also tried 'gptq_model-8bit-128g', with the same result

use_triton = False

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    revision="gptq-8bit-128g-actorder_False",
    model_basename=model_basename,
    use_safetensors=True,
    trust_remote_code=True,
    device="cuda:0",
    use_triton=use_triton,
    quantize_config=None,
)
"""

but I got an error: FileNotFoundError: Could not find model in TheBloke/vicuna-13b-v1.3.0-GPTQ. I should add that I don't get an error when I load the default version from the main branch, exactly as written in the "How to use this GPTQ model from Python code" example. What am I doing wrong?
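In case it helps narrow things down, a quick way to check what the branch actually contains would be something like this sketch (it assumes huggingface_hub's list_repo_files and just prints whatever safetensors files exist on that revision):
"""
# Diagnostic sketch: list the files on the target branch to see the actual
# safetensors basename there. The revision string is the branch I am trying to load.
from huggingface_hub import list_repo_files

files = list_repo_files(
    "TheBloke/vicuna-13b-v1.3.0-GPTQ",
    revision="gptq-8bit-128g-actorder_False",
)
print([f for f in files if f.endswith(".safetensors")])
"""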
Thanks a lot for your efforts!
