Python instructions didn't just work

#2
by Njax - opened

I'm very new to GPTQ, so please excuse this message if I'm in error. However, I couldn't get the example Python code to work as-is. I wound up changing it to this:

# Assumes auto-gptq and transformers are installed; fill in the
# path or repo ID of the quantized model you downloaded.
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

model_basename = "..."  # quantized model path or repo ID
use_triton = False

tokenizer = AutoTokenizer.from_pretrained(model_basename, use_fast=True)

model = AutoGPTQForCausalLM.from_quantized(model_basename,
    use_safetensors=True,
    trust_remote_code=True,
    device="cuda:0",
    use_triton=use_triton,
    quantize_config=None)

user_input = '''
// A javascript function
function printHelloWorld() {
'''

inputs = tokenizer(user_input, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=40)[0]
outputs = tokenizer.decode(output_ids)

I used CUDA 11.7, torch 2.0.1+cu117, and auto-gptq 0.2.2, which may account for the difference.
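Since the exact version mix seems to matter here, a quick way to check what your own environment has installed is to query package metadata directly. This is just a convenience sketch (the package names listed are assumptions; adjust them to whatever you want to compare):

```python
import importlib.metadata as md

def report_versions(packages=("torch", "auto-gptq", "transformers")):
    """Return {package: version string, or None if not installed}."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = md.version(pkg)
        except md.PackageNotFoundError:
            versions[pkg] = None  # package not found in this environment
    return versions

print(report_versions())
```

Pasting the output of this into a bug report makes it much easier to compare setups.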

Thanks for uploading this. Confusing stuff at times but it sure is exciting for something new!
