Getting error running inference in Free tier Google Colab

#6
by sudhir2016 - opened

Using this code
from transformers import AutoModelForCausalLM, AutoTokenizer


model = AutoModelForCausalLM.from_pretrained("hpcai-tech/Colossal-LLaMA-2-7b-base", device_map="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("hpcai-tech/Colossal-LLaMA-2-7b-base", trust_remote_code=True)

input = "Capital of India is ?"
inputs = tokenizer(input, return_tensors='pt')
inputs = inputs.to('cuda:0')
pred = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.3,
    top_k=50,
    top_p=0.95,
    num_return_sequences=1,
)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True)[len(input):])

Getting this error:
ValueError: The current device_map had weights offloaded to the disk. Please provide an offload_folder for them. Alternatively, make sure you have safetensors installed if the model you are using offers the weights in this format.

HPC-AI Technology org
edited Feb 20

Hi,

As the error message suggests, you will need to add an offload_folder argument to from_pretrained to specify a folder path where offloaded weights can be stored.

Thanks.
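A minimal sketch of the suggested fix, assuming a free-tier Colab runtime where the 7B weights don't fit in GPU/CPU memory: pass offload_folder so accelerate can spill the overflow layers to disk. The helper name load_with_offload and the folder name "offload" are illustrative, not part of the model card.

```python
import os


def load_with_offload(model_id: str, offload_dir: str):
    """Load a causal LM with disk offload enabled (hypothetical helper).

    Creates the offload directory if needed, then passes it via
    offload_folder so layers that don't fit in memory go to disk.
    """
    from transformers import AutoModelForCausalLM

    os.makedirs(offload_dir, exist_ok=True)
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",
        offload_folder=offload_dir,
        trust_remote_code=True,
    )


# Example call (downloads the full 7B checkpoint, so only run it
# on a machine with enough disk space):
# model = load_with_offload("hpcai-tech/Colossal-LLaMA-2-7b-base", "offload")
```

Note that inputs moved to cuda:0 still work with an offloaded model, but generation will be slow because offloaded layers are streamed from disk on each forward pass.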
