chat format does not match the paper

#18
by andysalerno - opened

Reading the paper: https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf

The chat format is presented in table 4, and looks like this:

<start_of_turn>user
Knock knock.<end_of_turn>
<start_of_turn>model
Who’s there?<end_of_turn>model
<start_of_turn>user
Gemma.<end_of_turn>
<start_of_turn>model
Gemma who?<end_of_turn>model

Notice that the model turns always end with <end_of_turn>model, whereas the user turns end with a bare <end_of_turn>, with no trailing 'user'.

I suspect this is an error in the paper and not in this repo.

Google org

That's true, thanks for flagging, and sorry about that! We will remove the two errant model tokens after <end_of_turn>.
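
For reference, here is a minimal Python sketch of the format with the errant tokens removed, so that every turn, user or model, closes with a bare <end_of_turn>. The helper names format_turn and format_chat are hypothetical and not part of this repo, and the newline placed after each <end_of_turn> is an assumption based on the line breaks in the paper's table.

```python
from typing import Iterable, Tuple


def format_turn(role: str, text: str) -> str:
    """Render one turn: role header, message text, then a bare <end_of_turn>.

    Hypothetical helper for illustration only. The trailing newline is an
    assumption taken from the line layout of the table in the paper.
    """
    return f"<start_of_turn>{role}\n{text}<end_of_turn>\n"


def format_chat(turns: Iterable[Tuple[str, str]]) -> str:
    """Concatenate (role, text) pairs into a single prompt string."""
    return "".join(format_turn(role, text) for role, text in turns)


# The knock-knock dialogue from the paper's table, without the trailing
# "model" after <end_of_turn>.
chat = format_chat([
    ("user", "Knock knock."),
    ("model", "Who’s there?"),
    ("user", "Gemma."),
    ("model", "Gemma who?"),
])
print(chat)
```

In practice it is safer to rely on the chat template shipped with the tokenizer than to build these strings by hand.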

suryabhupa changed discussion status to closed
