Make prompt fully compliant with spec

#48
by pcuenq

Why is this prompt slightly different from the one at https://huggingface.co/spaces/huggingface-projects/llama-2-13b-chat/blob/5b351de4c5dc896f73ccf93d5fc1450787d48298/model.py#L26? Which one is correct? Notice the additional `<s>` at the beginning.

Hi @federicomagnolfi-artificialy! Great question!

`<s>` is the "beginning of sequence" token, a.k.a. bos. It is usually added by the tokenizer, along with the end-of-sequence token (eos, or `</s>`). However, since the placement of the bos and eos tokens in the prompt has to be handled with care, we decided to make them fully explicit in that example and in the blog post. Note that in the code you posted, the tokenizer is invoked with `add_special_tokens=False`, meaning that we handle the special tokens ourselves and the tokenizer should not do anything about them.
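To make this concrete, here is a minimal sketch of that pattern, assuming the Llama 2 chat format; the model id and messages are placeholders, and Llama 2 checkpoints are gated, so you need access to run it as-is:

```python
from transformers import AutoTokenizer

# Placeholder checkpoint; any tokenizer with the same special tokens works.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-13b-chat-hf")

system_prompt = "You are a helpful assistant."
user_message = "Hello!"

# Explicit <s> at the start, following the Llama 2 chat format.
prompt = f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message} [/INST]"

# add_special_tokens=False: we already wrote <s> into the string ourselves,
# so the tokenizer must not prepend a second bos token.
input_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]

# Sanity check: exactly one bos id at the start, not two.
assert input_ids[0] == tokenizer.bos_token_id
```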

In this example, the tokenizer lives on the server (we are using a Text Generation Inference endpoint), so we can't pass `add_special_tokens=False`. We removed the initial `<s>` because it will be added by the server.
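For comparison, a minimal sketch of the server-side case, assuming a Text Generation Inference deployment reachable through `huggingface_hub.InferenceClient` (the endpoint URL and messages are placeholders):

```python
from huggingface_hub import InferenceClient

# Placeholder URL for a TGI deployment.
client = InferenceClient("https://my-tgi-endpoint.example")

system_prompt = "You are a helpful assistant."
user_message = "Hello!"

# No leading <s> here: the server-side tokenizer prepends bos itself,
# so writing it into the string would produce a duplicate bos token.
prompt = f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message} [/INST]"

print(client.text_generation(prompt, max_new_tokens=256))
```

Either way, the token ids the model sees should be identical; the only difference is whether the client or the server attaches the bos token.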

This can be quite confusing; the important thing is to pay close attention, as you did here :)

Hi @pcuenq, I hadn't noticed the `add_special_tokens` difference, thank you for the clarification!
