inference example

#2
by rrkotik - opened

Hello, can you show how to run inference with it?

I tried something like this:

model = transformers.LlamaForCausalLM.from_pretrained("Neko-Institute-of-Science/LLaMA-7B-4bit-128g", load_in_8bit=True, device_map='auto')

and I received this error:

ValueError: weight is on the meta device, we need a `value` to put in on 0.

You will need GPTQ-for-LLaMa to run this. `load_in_8bit=True` is bitsandbytes quantization and expects a full-precision checkpoint; it cannot load weights that are already GPTQ-quantized, which is why you get the meta-device error.
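As a hedged sketch (an assumption on my part, not a confirmed recipe for this repo): instead of GPTQ-for-LLaMa's own loader scripts, the AutoGPTQ library can also load GPTQ checkpoints. Something along these lines might work, with the repo id taken from the question above:

```python
def load_gptq_model(repo_id="Neko-Institute-of-Science/LLaMA-7B-4bit-128g"):
    """Load a GPTQ-quantized LLaMA checkpoint with AutoGPTQ (sketch).

    Imports live inside the function so the snippet can be read and
    imported without auto_gptq installed.
    """
    from transformers import AutoTokenizer
    from auto_gptq import AutoGPTQForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # from_quantized reads the pre-quantized weights directly instead of
    # trying to quantize on load (which is what load_in_8bit attempts).
    model = AutoGPTQForCausalLM.from_quantized(repo_id, device="cuda:0")
    return tokenizer, model
```

After loading, generation works as with any `transformers` causal LM, e.g. `model.generate(**tokenizer("Hello", return_tensors="pt").to("cuda:0"))`. Whether the checkpoint's file layout matches what AutoGPTQ expects (e.g. safetensors name, `model_basename`) is something you may need to check against the repo.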