How do I run this on CPU?
I'm a bit new to this, so any help is appreciated. Thank you!
Hi! Can you try this? It might be a bit slow on CPU:
# Load model
import transformers, torch

compute_dtype = torch.float32
cache_path = ''
device = 'cpu'
model_id = "mobiuslabsgmbh/aanaphi2-v0.1"

model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=compute_dtype,
    cache_dir=cache_path,
    device_map=device,
)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id, cache_dir=cache_path)

# Set prompt format
instruction_template = "### Human: "
response_template = "### Assistant: "

def prompt_format(prompt):
    out = instruction_template + prompt + '\n' + response_template
    return out

model.eval()

def generate(prompt, max_length=1024):
    prompt_chat = prompt_format(prompt)
    inputs = tokenizer(prompt_chat, return_tensors="pt", return_attention_mask=True).to(device)
    outputs = model.generate(**inputs, max_length=max_length, eos_token_id=tokenizer.eos_token_id)
    # Drop the final token (normally the end-of-sequence token) before decoding
    text = tokenizer.batch_decode(outputs[:, :-1])[0]
    return text

# Generate
print(generate('If A+B=C and B=C, what would be the value of A?'))
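A side note on the output: the decoded text from the snippet above contains the prompt as well as the answer. Here is a minimal sketch of how the prompt format works and how you could keep only the reply. `extract_response` is a hypothetical helper, not part of the model or of transformers, and the sample reply string is made up for illustration.

```python
# Same templates as in the snippet above.
INSTRUCTION_TEMPLATE = "### Human: "
RESPONSE_TEMPLATE = "### Assistant: "


def prompt_format(prompt):
    """Wrap a user message in the Human/Assistant prompt format."""
    return INSTRUCTION_TEMPLATE + prompt + "\n" + RESPONSE_TEMPLATE


def extract_response(decoded_text):
    """Hypothetical helper: return only the text after the first assistant marker.

    If the marker is missing, the input is returned unchanged (whitespace stripped).
    """
    _, marker, tail = decoded_text.partition(RESPONSE_TEMPLATE)
    return tail.strip() if marker else decoded_text.strip()


# Simulated decoded output: the prompt followed by a made-up reply.
example_output = prompt_format("Hi") + "Hello there!"

print(repr(prompt_format("Hi")))
print(extract_response(example_output))
```

With the real model you would call `extract_response(generate(...))` instead of using the simulated string.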
Thank you for your answer! I tried this, but the process gets a Killed signal. I watched my CPU and memory usage while it was running, and neither was close to full. I wonder what the issue might be!
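On Linux, a bare Killed message often comes from the kernel's out-of-memory killer, so a rough estimate of how much RAM the weights alone need may be a useful sanity check. This is only a back-of-the-envelope sketch: the parameter count of 2.7 billion is an assumption (it is the size of the phi-2 base model; nothing in this thread states the size of this checkpoint), and real usage will be higher because of activations and loading overhead.

```python
def estimate_weight_memory_gb(num_params, bytes_per_param):
    """Rough size of the model weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: about 2.7 billion parameters (not confirmed in this thread).
ASSUMED_NUM_PARAMS = 2.7e9

# Bytes per parameter for the common dtypes.
for dtype_name, bytes_per_param in [("float32", 4), ("float16 / bfloat16", 2)]:
    size_gb = estimate_weight_memory_gb(ASSUMED_NUM_PARAMS, bytes_per_param)
    print(f"{dtype_name}: ~{size_gb:.1f} GB for the weights alone")
```

If the float32 figure is close to or above the free RAM on the machine, memory could be the cause even if the usage monitor never appeared full before the process was killed.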
Strange, it shouldn't use too much RAM to get killed!
You can also just run it on Google Colab with the free GPU. I have just tried it there and it works fine. You'll need to install the following before you run the code:
pip install transformers accelerate
Thank you so much for all your help, I really appreciate it!
Happy to help!