<|eot_id|> in aphrodite-engine

#2
by med4u - opened

Using aphrodite-engine for inference, I get "<|eot_id|>" at the end of every answer.
Any clue how to get rid of it? Thank you!

Double-check that you're using the LLaMA-3 prompt format in your frontend or environment. The finetune didn't change it, so you still have to follow the proper instruction template.
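For reference, here is a rough sketch of what the LLaMA-3 instruct template looks like when built by hand. The function name `build_llama3_prompt` is made up for illustration; most frontends have a LLaMA-3 preset that does this for you, so treat this as a way to check what your frontend sends rather than something you need to run.

```python
def build_llama3_prompt(user_message, system_message=None):
    """Build a single-turn prompt in the LLaMA-3 instruct format.

    Hypothetical helper for illustration only; not part of any library.
    """
    parts = ["<|begin_of_text|>"]
    if system_message:
        # Optional system turn, closed with <|eot_id|> like every other turn.
        parts.append(
            "<|start_header_id|>system<|end_header_id|>\n\n"
            f"{system_message}<|eot_id|>"
        )
    # User turn.
    parts.append(
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
    )
    # Open the assistant turn; the model should end its reply with <|eot_id|>.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_llama3_prompt("Hello!", system_message="You are a helpful assistant."))
```

Each turn ends with `<|eot_id|>`, which is why the backend has to treat that token as a stop token; if it doesn't, it shows up as text in the answer.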
If there are no issues there, it could be the file itself. LLaMA-3 originally shipped with a tokenizer configuration issue, and all the early quants and derivatives carry the same error. Chances are this is the cause of your issue.
The good news is that the model doesn't have to be retrained; it's a mere config mishap. The outdated tokenizer config still has to be updated, though, and the old GGUFs should be updated as well.
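If you have the unquantized model folder locally, something along these lines could patch the config. This is only a sketch under a few assumptions: that the problem is the end-of-sequence setting not pointing at `<|eot_id|>` (token id 128009 in the LLaMA-3 vocabulary), that your folder uses the usual `tokenizer_config.json` and `generation_config.json` files, and that your backend reads them. The helper name `patch_llama3_eos` is made up. It does not touch GGUF files; those carry their own metadata and would need to be re-converted.

```python
import json
from pathlib import Path

EOT_TOKEN = "<|eot_id|>"
EOT_TOKEN_ID = 128009  # id of <|eot_id|> in the LLaMA-3 vocabulary


def _load(path):
    return json.loads(path.read_text(encoding="utf-8"))


def _save(path, data):
    path.write_text(json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8")


def patch_llama3_eos(model_dir):
    """Point the EOS settings in a local model folder at <|eot_id|>.

    Hypothetical helper. Returns the names of the files it changed.
    """
    model_dir = Path(model_dir)
    changed = []

    tok_path = model_dir / "tokenizer_config.json"
    if tok_path.exists():
        cfg = _load(tok_path)
        eos = cfg.get("eos_token")
        # eos_token may be a plain string or a dict with a "content" field.
        current = eos.get("content") if isinstance(eos, dict) else eos
        if current != EOT_TOKEN:
            if isinstance(eos, dict):
                eos["content"] = EOT_TOKEN
            else:
                cfg["eos_token"] = EOT_TOKEN
            _save(tok_path, cfg)
            changed.append(tok_path.name)

    gen_path = model_dir / "generation_config.json"
    if gen_path.exists():
        cfg = _load(gen_path)
        eos_ids = cfg.get("eos_token_id")
        # Normalize to a list so the existing stop id(s) are kept.
        if eos_ids is None:
            eos_ids = []
        elif isinstance(eos_ids, int):
            eos_ids = [eos_ids]
        if EOT_TOKEN_ID not in eos_ids:
            cfg["eos_token_id"] = list(eos_ids) + [EOT_TOKEN_ID]
            _save(gen_path, cfg)
            changed.append(gen_path.name)

    return changed
```

Back up the folder first, then call `patch_llama3_eos("path/to/model")` and reload the model. If the token still appears, adding `<|eot_id|>` as a custom stop string in your frontend is a reasonable stopgap.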
