chat format does not match the paper
#18, opened by andysalerno
Reading the paper: https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf
The chat format is presented in table 4, and looks like this:
<start_of_turn>user
Knock knock.<end_of_turn>
<start_of_turn>model
Who’s there?<end_of_turn>model
<start_of_turn>user
Gemma.<end_of_turn>
<start_of_turn>model
Gemma who?<end_of_turn>model
Notice how the model turns always end with <end_of_turn>model, but the user turns end with <end_of_turn>, without 'user' at the end.
I suspect this is an error in the paper and not in this repo.
That's true, thanks for flagging, and sorry about that! We will remove the two errant model tokens after <end_of_turn>.
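For anyone building prompts by hand, here is a minimal sketch of the format with the errant tokens dropped, assuming every turn (user or model) is closed by a bare <end_of_turn> followed by a newline, as the table's layout suggests. The function name format_chat and the add_generation_prompt flag are made up for illustration and are not part of this repo; the sketch also leaves out any beginning-of-sequence token a tokenizer may add.

```python
from typing import Iterable, Tuple

START_OF_TURN = "<start_of_turn>"
END_OF_TURN = "<end_of_turn>"
ROLES = ("user", "model")


def format_chat(
    turns: Iterable[Tuple[str, str]],
    add_generation_prompt: bool = False,
) -> str:
    """Render (role, text) pairs in the turn format discussed in this thread.

    Hypothetical helper: every turn ends with a bare <end_of_turn>,
    with no role name after it.
    """
    parts = []
    for role, text in turns:
        if role not in ROLES:
            raise ValueError(f"unknown role: {role!r}")
        parts.append(f"{START_OF_TURN}{role}\n{text}{END_OF_TURN}\n")
    if add_generation_prompt:
        # Open a model turn so the model continues from here.
        parts.append(f"{START_OF_TURN}model\n")
    return "".join(parts)


if __name__ == "__main__":
    conversation = [
        ("user", "Knock knock."),
        ("model", "Who's there?"),
        ("user", "Gemma."),
    ]
    print(format_chat(conversation, add_generation_prompt=True))
```

Passing add_generation_prompt=True leaves an open model turn at the end, which is the usual way to ask for the next reply.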
suryabhupa changed discussion status to closed