JoAnn Alvarez

ruddjm

ruddjm's activity

replied to aaditya's post 6 months ago

On the model card, you have written "Please use the exact chat template provided by Llama-3 instruct version. Otherwise there will be a degradation in the performance."

I tried using apply_chat_template, but I get a different result depending on whether I use the Llama 3 Instruct tokenizer or the OpenBioLLM tokenizer:

from transformers import AutoTokenizer

model_id = "aaditya/Llama3-OpenBioLLM-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]

# Encode with apply_chat_template, then decode to see what the prompt is supposed to look like:
token_inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True,
)

decoded_inputs = tokenizer.decode(token_inputs[0], skip_special_tokens=False)
print(decoded_inputs)
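
For comparison, this is the check I ran with the base instruct tokenizer (assuming meta-llama/Meta-Llama-3-8B-Instruct is the intended reference model; that repo is gated, so it requires accepting the license and using an access token):

# Same messages, but tokenized with the Llama 3 Instruct chat template
# (assumed reference model: "meta-llama/Meta-Llama-3-8B-Instruct").
llama3_tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

llama3_inputs = llama3_tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True,
)
print(llama3_tokenizer.decode(llama3_inputs[0], skip_special_tokens=False))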

For OpenBioLLM: '<|im_start|>system\nYou are a friendly chatbot who always responds in the style of a pirate<|im_end|>\n<|im_start|>user\nHow many helicopters can a human eat in one sitting?<|im_end|>\n<|im_start|>assistant\n'

For Llama 3 Instruct: '<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a friendly chatbot who always responds in the style of a pirate<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHow many helicopters can a human eat in one sitting?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n'

Any insight on this?