How should names & fewshot examples be encoded?

#89
by theobjectivedad - opened

In the Llama3 prompt template, is there a preferred way to encode names, for example:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Welcome to the chat!<|eot_id|><|start_header_id|>user<|end_header_id|>

Bob: Hello there<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Alice: How can I help you?<|eot_id|><|start_header_id|>user<|end_header_id|>

Bob: I have a question.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Alice: this must be last

Or should names get encoded in headers similar to ChatML?

Also, I am curious if there is a preferred way to encode few-shot examples (see this example).

Thank you in advance!

UPDATE 1:
For some additional background, I reviewed the Meta Llama3 prompt format guide and didn't see any references for name / few-shot example encoding. Additionally, looking at tokenizer.py, only the role string is encoded in the header, which I why I suggested the format above where names are encoded in the content body of each message.

UPDATE 2:
If name encoding hasn't been defined yet, I recommend either encoding it in the header or using a special token to clearly delineate the name from other fields in the chat. The main problem encoding the name the way I did in the example above is parsing it from a string when names are optional. This unnecessarily increases the complexity of the parser where different regular expressions must be used depending on whether names are enabled. This still makes parsing almost impossible when names are optional on a per message basis or when other rules are introduced (such at a ChatML-style few-shot example). Again, if the specification hasn't been decided I would recommend this:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Welcome to the chat!<|eot_id|><|start_header_id|>user name=Bob<|end_header_id|>

Hello there<|eot_id|><|start_header_id|>assistant name=Alice<|end_header_id|>

How can I help you?<|eot_id|><|start_header_id|>user name=Bob<|end_header_id|>

I have a question.<|eot_id|><|start_header_id|>assistant name=Alice<|end_header_id|>

this must be last
theobjectivedad changed discussion title from How should names be encoded to How should names & fewshot examples be encoded?

Sign up or log in to comment