Model repeating prompt and not learning eos token

#129
by Essacheez - opened

I am trying to fine-tune Mistral, but the model repeats the input prompt and does not learn to emit the EOS token.
I tried changing tokenizer.pad_token from eos_token to unk_token and switching the padding side from right to left, but neither change helped. I have already added the BOS and EOS tokens to my dataset.
Here is an example from my dataset:

<s>[INST]Translate the following text from French to English: D'où venons-nous?[/INST]Where did we come from?</s>

Currently these are my settings:
tokenizer.add_eos_token = False
tokenizer.add_bos_token = False
tokenizer.pad_token = tokenizer.unk_token
tokenizer.padding_side = "left"
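
One thing worth checking with settings like these is whether the EOS token actually survives into the training labels. The sketch below is a library-free illustration, not the actual training pipeline: it assumes the collator masks labels by comparing against the pad token id, and the token ids (`UNK_ID`, `BOS_ID`, `EOS_ID`, and the 100-102 text ids) and helper names are hypothetical. Under that assumption, padding with EOS removes the final EOS from the loss, while padding with a different token or masking by attention mask keeps it.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default

# Hypothetical token ids, chosen only for illustration.
UNK_ID, BOS_ID, EOS_ID = 0, 1, 2


def left_pad(ids, length, pad_id):
    """Left-pad a list of token ids; return (input_ids, attention_mask)."""
    n_pad = length - len(ids)
    return [pad_id] * n_pad + ids, [0] * n_pad + [1] * len(ids)


def labels_by_token_id(input_ids, pad_id):
    """Mask every position whose token equals pad_id (assumed collator behaviour)."""
    return [IGNORE_INDEX if t == pad_id else t for t in input_ids]


def labels_by_attention_mask(input_ids, attention_mask):
    """Mask only the positions that are real padding."""
    return [t if m == 1 else IGNORE_INDEX for t, m in zip(input_ids, attention_mask)]


# <s> ... text tokens ... </s>
example = [BOS_ID, 100, 101, 102, EOS_ID]

# Case 1: pad token is EOS and labels are masked by token id.
ids_eos_pad, mask_eos_pad = left_pad(example, 8, pad_id=EOS_ID)
labels_eos_pad = labels_by_token_id(ids_eos_pad, pad_id=EOS_ID)

# Case 2: pad token is UNK and labels are masked by token id.
ids_unk_pad, mask_unk_pad = left_pad(example, 8, pad_id=UNK_ID)
labels_unk_pad = labels_by_token_id(ids_unk_pad, pad_id=UNK_ID)

# Case 3: pad token is EOS but labels are masked by attention mask.
labels_by_mask = labels_by_attention_mask(ids_eos_pad, mask_eos_pad)

print("EOS in labels (pad=EOS, mask by id):  ", EOS_ID in labels_eos_pad)
print("EOS in labels (pad=UNK, mask by id):  ", EOS_ID in labels_unk_pad)
print("EOS in labels (pad=EOS, mask by attn):", EOS_ID in labels_by_mask)
```

If the real labels from the data loader look like case 1, the model never receives a training signal for the EOS token, which would be consistent with generation that does not stop.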

