
Could not find model in TheBloke/stable-vicuna-13B-GPTQ

#27 by AB00k - opened

Could not find model in TheBloke/stable-vicuna-13B-GPTQ

I'm getting the above error when trying to load the model with the following code. Is it not yet available?
from transformers import AutoTokenizer, pipeline, logging
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
import argparse

# change this path to match where you downloaded the model
quantized_model_dir = "TheBloke/stable-vicuna-13B-GPTQ"

model_basename = "stable-vicuna-13B-GPTQ-4bit.compat.no-act-order.safetensors"

use_triton = False

tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, use_fast=True)

quantize_config = BaseQuantizeConfig(
    bits=4,
    group_size=128,
    desc_act=False
)

model = AutoGPTQForCausalLM.from_quantized(
    quantized_model_dir,
    use_safetensors=True,
    model_basename=model_basename,
    device="cuda:0",
    use_triton=use_triton,
    quantize_config=quantize_config
)

@TheBloke can you please clarify this?

Don't put .safetensors on the end of model_basename. With use_safetensors=True, AutoGPTQ appends the extension itself, so the loader ends up searching for a file ending in .safetensors.safetensors, which doesn't exist. Use:

model_basename = "stable-vicuna-13B-GPTQ-4bit.compat.no-act-order"
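For anyone landing here with the same error, here is a minimal sketch of the full loading code with the corrected basename (same repo, file, and settings as in the post above). It assumes auto-gptq is installed and a CUDA GPU is available; the prompt template at the end is the "### Human: / ### Assistant:" format stable-vicuna expects per the model card, so treat that part as an assumption.

from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

quantized_model_dir = "TheBloke/stable-vicuna-13B-GPTQ"
# no .safetensors extension: AutoGPTQ adds it when use_safetensors=True
model_basename = "stable-vicuna-13B-GPTQ-4bit.compat.no-act-order"

tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, use_fast=True)

quantize_config = BaseQuantizeConfig(
    bits=4,          # 4-bit quantization
    group_size=128,  # quantization group size
    desc_act=False   # no act-order, matching the "no-act-order" file
)

model = AutoGPTQForCausalLM.from_quantized(
    quantized_model_dir,
    use_safetensors=True,
    model_basename=model_basename,
    device="cuda:0",
    use_triton=False,
    quantize_config=quantize_config
)

# quick smoke test (prompt format assumed from the stable-vicuna model card)
prompt = "### Human: What is the capital of France?\n### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))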

Thanks, it worked!
