Inferencing the model

by saivineetha - opened Jan 2, 2024

Jan 2, 2024

Hi,

I'm trying to run inference on the model with a pipeline, but I'm getting a blank response. I used the example prompt given in the paper for the Open Question Answering task.
I'm attaching the code I used for inference:

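from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model = AutoModelForCausalLM.from_pretrained('abhinand/tamil-llama-7b-base-v0.1')
tokenizer = AutoTokenizer.from_pretrained('abhinand/tamil-llama-7b-base-v0.1')
# Prompt taken from the paper. English: "Write a short news article under the
# headline 'Chennai Super Kings (CSK) won the IPL series'."
txt = "ஐபிஎல் தொடரை சென்னை சூப்பர் கிங்ஸ் (சிஎஸ்கே) வென்றது என்ற தலைப்பில் ஒரு சிறு செய்திக் கட்டுரையை எழுதுங்கள்."

# Build the text-generation pipeline from the loaded model and tokenizer
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
sequences = generator(
    txt,
    do_sample=True,
    top_k=10,
    temperature=0.2,
    max_length=1024,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")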

The response is blank: only the prompt I passed appears in the output. I'm attaching an image showing this.

Can anyone help me with this?

abhinand

Owner Jan 2, 2024

You mention the Open Question Answering task, but you are using the base model.

In simple terms (in case you aren't aware), there are two types of LMs:

Base Model: Trained on huge amounts of text data and suitable for CLM (causal language modeling, i.e. next-word prediction) tasks.
Fine-tuned Model: The base model fine-tuned on an instruction or chat dataset, which makes it suitable for interaction with humans.

So in your case you need to use the instruct model (unless you are willing to do a massive domain adaptation or fine-tuning on diverse datasets).
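As a rough illustration of that rule of thumb, the sketch below maps a use case to one of the two checkpoints named in this thread. The helper `pick_checkpoint` and the use-case labels are hypothetical, chosen only for this example.

```python
BASE_MODEL = "abhinand/tamil-llama-7b-base-v0.1"
INSTRUCT_MODEL = "abhinand/tamil-llama-7b-instruct-v0.1"

# Hypothetical use-case labels, grouped by the kind of model they call for.
INSTRUCTION_USE_CASES = {"question-answering", "chat", "instruction-following"}
BASE_USE_CASES = {"text-completion", "domain-adaptation"}


def pick_checkpoint(use_case: str) -> str:
    """Return the checkpoint suited to a use case (illustrative helper)."""
    if use_case in INSTRUCTION_USE_CASES:
        return INSTRUCT_MODEL  # fine-tuned to follow prompts written by people
    if use_case in BASE_USE_CASES:
        return BASE_MODEL  # plain next-token prediction, or a starting point for further training
    raise ValueError(f"Unknown use case: {use_case!r}")
```

For the question in this thread, `pick_checkpoint("question-answering")` returns the instruct checkpoint.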

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

tokenizer = AutoTokenizer.from_pretrained("abhinand/tamil-llama-7b-instruct-v0.1")
model = AutoModelForCausalLM.from_pretrained(
    "abhinand/tamil-llama-7b-instruct-v0.1",
    # OTHER MODEL ARGUMENTS HERE
)
model.eval()

generation_config = GenerationConfig(
    temperature=0.3,
    top_k=50,
    top_p=0.90,
    repetition_penalty=1.1,
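    eos_token_id=tokenizer.eos_token_id,
    do_sample=True,
    # Limit on newly generated tokens only; the prompt is not counted.
    # (max_length is omitted because max_new_tokens takes precedence when both are set.)
    max_new_tokens=128,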
)

def format_instruction(system_prompt, question, input=None):
    if input is not None:
        return f"""{system_prompt}

### Instruction:
{question}

### Input:
{input}

### Response:
"""
    else:
        return f"""{system_prompt}

### Instruction:
{question}

### Response:
"""

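device = "cuda" if torch.cuda.is_available() else "cpu"
# Keep the model on the same device as the inputs
# (skip this line if the model was loaded with a device_map).
model.to(device)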

def run_inference(prompt):
    input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)
    output = model.generate(input_ids, generation_config=generation_config, pad_token_id=18610)

    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

    return generated_text


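# English: "You are an AI assistant that answers in Tamil. The user will give you a task.
# Your goal is to complete the task as faithfully as you can. While performing the task,
# think step by step and justify your actions."
SYS_PROMPT1 = "நீங்கள் தமிழில் பதிலளிக்கும் AI உதவியாளர். பயனர் உங்களுக்கு ஒரு பணியை வழங்குவார். உங்களால் முடிந்தவரை உண்மையாக பணியை முடிப்பதே உங்கள் குறிக்கோள். பணியைச் செய்யும்போது, படிப்படியாக சிந்தித்து, உங்கள் நடவடிக்கைகளை நியாயப்படுத்தவும்."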

instruction = format_instruction(
    system_prompt=SYS_PROMPT1,
    # English: "Explain the difference between DNA and RNA in one line"
    question="""DNA மற்றும் RNA இடையே உள்ள வேறுபாட்டை ஒரு வரியில் விளக்கவும்"""
)

output = run_inference(instruction)

print(output)
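
Because the whole output sequence is decoded, the printed text contains the prompt followed by the model's answer. A minimal sketch for keeping only the answer, assuming the `### Response:` marker from the template above appears in the decoded text; `extract_response` is a hypothetical helper, not part of transformers:

```python
RESPONSE_MARKER = "### Response:"


def extract_response(generated_text: str) -> str:
    """Return the text after the first '### Response:' marker."""
    _, marker, tail = generated_text.partition(RESPONSE_MARKER)
    # partition() yields an empty separator when the marker is absent;
    # in that case fall back to the full text.
    return tail.strip() if marker else generated_text.strip()
```

It could be applied as `print(extract_response(output))`.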

saivineetha

Jan 2, 2024

Thank you for the reply. It works now.

I also want to run inference on the base model to see how it behaves, and later adapt the code to my own dataset, i.e. for domain adaptation.

How can I run inference on the base model "abhinand/tamil-llama-7b-base-v0.1"?
How do I do text generation with it?
Can I use the following code?
generator = pipeline(task="text-generation", model="abhinand/tamil-llama-7b-base-v0.1")

abhinand

Owner Jan 2, 2024

Sure! You can use the pipeline and test out the model for your needs.

Below is an example:
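
A minimal sketch of such a pipeline call, assuming `transformers` and `torch` are installed. The helper names `build_generator` and `continue_text` are made up for illustration, and the sampling values are borrowed from the instruct example earlier in this thread rather than tuned for the base model.

```python
def build_generator(model_id: str = "abhinand/tamil-llama-7b-base-v0.1"):
    """Create a text-generation pipeline (downloads the weights on first use)."""
    # Imported lazily so the helpers can be defined without loading transformers.
    from transformers import pipeline

    return pipeline("text-generation", model=model_id)


def continue_text(generator, opening: str, max_new_tokens: int = 128) -> str:
    """Ask the model to continue `opening`; returns the opening plus its continuation."""
    outputs = generator(
        opening,
        do_sample=True,
        temperature=0.3,         # assumed values, copied from the instruct example
        top_k=50,
        top_p=0.90,
        repetition_penalty=1.1,
        max_new_tokens=max_new_tokens,
    )
    # For a single string input, the pipeline returns a list of dicts.
    return outputs[0]["generated_text"]


# Usage (loads the 7B model, so it is left commented out):
# generator = build_generator()
# print(continue_text(generator, "ஐபிஎல் தொடரை சென்னை சூப்பர் கிங்ஸ்"))
```

Since a base model only continues text, the prompt should be the opening words of the passage you want, not an instruction.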

saivineetha

Jan 2, 2024

Thanks a lot!

saivineetha changed discussion status to closed Jan 2, 2024
