To use the model, run:

import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the PEFT adapter weights to it.
ref_model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m-deduped-v0", torch_dtype=torch.bfloat16)
peft_model_id = "w601sxs/pythia-70m-instruct-orca-chkpt-64000"

config = PeftConfig.from_pretrained(peft_model_id)
model = PeftModel.from_pretrained(ref_model, peft_model_id)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

model = model.to("cuda:0")
model.eval()

# Placeholder prompt: fill in the context and question following the prompt format below.
prompt = "context: < ... >\n question: < ... >\n answer: <"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    # Keep the input tensor on the same device as the model.
    outputs = model.generate(input_ids=inputs["input_ids"].to("cuda:0"), max_new_tokens=10)
    print(tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0])

Prompt format

context: < ... >
question: < ... >
answer: < ... >

For example:

context: <You are an AI assistant. User will you give you a task. Your goal is to complete the task as faithfully as you can. While performing the task think step-by-step and justify your steps.>
 question: <Here is some data: The Rice Boat eatType restaurant; The Rice Boat food Fast food; The Rice Boat familyFriendly yes; The Rice Boat near Express by Holiday Inn.

Write a sentence that describes this data:>
 answer: <
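
The prompt layout above can be assembled programmatically. The following is a minimal sketch, not part of the original model card: `build_prompt` and `extract_answer` are hypothetical helper names, and the exact whitespace between fields (a newline followed by a single space) is an assumption inferred from the example above. It also assumes the model ends its answer with a closing `>`, which may not always hold.

```python
def build_prompt(context: str, question: str) -> str:
    """Assemble a prompt in the context/question/answer layout shown above."""
    # The trailing "answer: <" is left open so the model writes the answer.
    # Assumption: fields are separated by a newline plus one space.
    return f"context: <{context}>\n question: <{question}>\n answer: <"


def extract_answer(generated: str, prompt: str) -> str:
    """Return the text after the prompt, cut at the first '>' if one appears."""
    # Assumption: the decoded output begins with the prompt text; if it does
    # not, the whole string is treated as the completion.
    completion = generated[len(prompt):] if generated.startswith(prompt) else generated
    return completion.split(">", 1)[0].strip()


prompt = build_prompt(
    "You are an AI assistant.",
    "Write a sentence that describes this data:",
)
print(prompt)
```

The resulting `prompt` string can be passed to the tokenizer in the usage snippet, and `extract_answer` applied to the decoded output. With `max_new_tokens=10` the answer will often be cut off before any closing `>`; a larger value is likely needed for full sentences.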

Dataset used to train w601sxs/pythia-70m-instruct-orca-chkpt-64000