What's the decoder_start_token_id and eos_token_id used in training?
#7 opened by cqchangm
In large-v3, the start and end tokens are (50258, 50257), i.e. ("<|startoftranscript|>", "<|endoftext|>").
In this model, they are (50257, 50256), which map to ("<|endoftext|>", "") according to added_tokens.json.
Was the model fine-tuned this way, i.e. with <|endoftext|> as the decoder start token? Or was it just a typo in the config?
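For anyone who wants to reproduce the comparison, here is a minimal sketch using transformers. The large-v3 repo id is the official one; "your-org/this-model" is a placeholder for this model's repo, since I'm only quoting from its config and added_tokens.json:

```python
# Minimal sketch: compare the decoder start / eos token ids of
# Whisper large-v3 against this model. "your-org/this-model" is
# a placeholder repo id, not the actual model name.
from transformers import AutoConfig, AutoTokenizer

for repo in ("openai/whisper-large-v3", "your-org/this-model"):
    config = AutoConfig.from_pretrained(repo)
    tokenizer = AutoTokenizer.from_pretrained(repo)
    start_id = config.decoder_start_token_id
    eos_id = config.eos_token_id
    print(repo)
    print("  decoder_start_token_id:", start_id,
          tokenizer.convert_ids_to_tokens(start_id))
    print("  eos_token_id:          ", eos_id,
          tokenizer.convert_ids_to_tokens(eos_id))
```

For openai/whisper-large-v3 this prints 50258 / <|startoftranscript|> and 50257 / <|endoftext|>, matching the values above.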