Update generation_config.json

#2
by abhi-db - opened

I noticed when using the instruct model with chat templating that the chat template ends each turn with <|eot_id|> rather than the EOS token <|end_of_text|>. So when the assistant responds to a message, it emits <|eot_id|> as well. Unfortunately, the generation config doesn't list <|eot_id|> as a stop token, so the model keeps generating past the end of its turn.
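You can see this directly by rendering the chat template (a quick check, assuming the meta-llama/Meta-Llama-3-8B-Instruct repo id):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
messages = [{"role": "user", "content": "Hello!"}]
rendered = tokenizer.apply_chat_template(messages, tokenize=False)
print(rendered)  # each turn ends with <|eot_id|>, not <|end_of_text|>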

In the model card, I see that there is a workaround: manually pass eos_token_id in any generate call or pipeline:

import torch
import transformers

# Pipeline setup as in the model card (model id assumed).
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)
pipeline = transformers.pipeline(
    "text-generation", model=model_id, tokenizer=tokenizer,
    model_kwargs={"torch_dtype": torch.bfloat16}, device_map="auto",
)

# Stop on both the EOS token (<|end_of_text|>) and <|eot_id|>.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    prompt,  # a chat-templated prompt string, assumed defined earlier
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

But I think there is a simpler way to fix this! If you update generation_config.json to stop on both <|end_of_text|> and <|eot_id|>, generation will terminate automatically and users won't need to build the terminators list by hand.
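Concretely, eos_token_id in generation_config.json would become a list holding both token ids (a sketch, assuming the standard Llama 3 vocabulary where <|end_of_text|> is 128001 and <|eot_id|> is 128009; the sampling fields shown are illustrative):

{
  "bos_token_id": 128000,
  "eos_token_id": [128001, 128009],
  "do_sample": true,
  "temperature": 0.6,
  "top_p": 0.9
}

With that in place, model.generate and pipelines pick up both stop tokens automatically from the repo's generation config.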

Meta Llama org

Thank you @abhi-db !

pcuenq changed pull request status to merged
