question answering using llama

#7 by Iamexperimenting

Hi,

Can someone tell me how to run a question-answering model using LLaMA? I'm trying to build a question-answering setup, but when I run it, it doesn't give the correct answer even though I pass the context along with the question.

from langchain import PromptTemplate, LLMChain
from langchain.llms import HuggingFacePipeline
from transformers import pipeline
import torch

# Prompt that injects the context passage and the user's question.
template = """Read this article and answer the question below.\n{context}\n{question}\nIf you don't know the answer, say "I don't know"; do not make up an answer."""
prompt = PromptTemplate(template=template, input_variables=["context", "question"])
model_name = "openlm-research/llama-7b-hf"

# Wrap a transformers text-generation pipeline so LangChain can drive it.
hf_pipe = HuggingFacePipeline(pipeline=pipeline(
    "text-generation", model=model_name, device="cuda:0",
    torch_dtype=torch.float16, max_new_tokens=512))
llm_chain = LLMChain(prompt=prompt, llm=hf_pipe)

torch.cuda.empty_cache()
result = llm_chain.run(context=input_context, question=user_question)
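
# For illustration only: hypothetical values for input_context and user_question
# (not from the original post), showing an end-to-end call of the chain.
input_context = "The Eiffel Tower was completed in 1889 and stands in Paris, France."
user_question = "When was the Eiffel Tower completed?"
print(llm_chain.run(context=input_context, question=user_question))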

I'm using the above script, but it is not giving the correct answer even though the answer is present in the context. Can someone help me here?

You're trying this with a base model that hasn't been fine-tuned for this task, so it's effectively a fancy auto-complete. The answer being present in the context doesn't matter if the model hasn't learned (e.g. its attention mechanism hasn't been trained) to look for it there!
Try fine-tuning the model first:
https://huggingface.co/docs/transformers/tasks/question_answering
https://colab.research.google.com/drive/1VoYNfYDKcKRQRor98Zbf2-9VQTtGJ24k?usp=sharing#scrollTo=XIyP_0r6zuVc
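
For reference, here is a condensed sketch of the extractive question-answering fine-tuning recipe from the first link above. Note that the tutorial fine-tunes distilbert-base-uncased on SQuAD rather than LLaMA, so treat the model name, dataset slice, and hyperparameters below as illustrative placeholders rather than a drop-in fix for the script in the question.

from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForQuestionAnswering,
                          DefaultDataCollator, TrainingArguments, Trainer, pipeline)

squad = load_dataset("squad", split="train[:5000]").train_test_split(test_size=0.2)
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def preprocess(examples):
    # Tokenize question/context pairs and map each answer to start/end token indices.
    inputs = tokenizer(
        [q.strip() for q in examples["question"]],
        examples["context"],
        max_length=384,
        truncation="only_second",  # only truncate the context, never the question
        return_offsets_mapping=True,
        padding="max_length",
    )
    start_positions, end_positions = [], []
    for i, offsets in enumerate(inputs.pop("offset_mapping")):
        answer = examples["answers"][i]
        start_char = answer["answer_start"][0]
        end_char = start_char + len(answer["text"][0])
        sequence_ids = inputs.sequence_ids(i)
        # Locate the token span covered by the context (sequence id 1).
        ctx_start = sequence_ids.index(1)
        ctx_end = len(sequence_ids) - 1 - sequence_ids[::-1].index(1)
        if offsets[ctx_start][0] > end_char or offsets[ctx_end][1] < start_char:
            # The answer was truncated out of the context; label it (0, 0).
            start_positions.append(0)
            end_positions.append(0)
        else:
            idx = ctx_start
            while idx <= ctx_end and offsets[idx][0] <= start_char:
                idx += 1
            start_positions.append(idx - 1)
            idx = ctx_end
            while idx >= ctx_start and offsets[idx][1] >= end_char:
                idx -= 1
            end_positions.append(idx + 1)
    inputs["start_positions"] = start_positions
    inputs["end_positions"] = end_positions
    return inputs

tokenized = squad.map(preprocess, batched=True, remove_columns=squad["train"].column_names)

model = AutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="qa_model", learning_rate=2e-5,
                           per_device_train_batch_size=16, num_train_epochs=3),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=DefaultDataCollator(),
    tokenizer=tokenizer,
)
trainer.train()

# Once fine-tuned, the model answers directly from a supplied context:
qa = pipeline("question-answering", model=model, tokenizer=tokenizer)
print(qa(question="When was the Eiffel Tower completed?",
         context="The Eiffel Tower was completed in 1889 and stands in Paris."))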
