validation scores

#10
by AntonioMartini - opened

Are the validation scores in Figure 4 of the paper for the normal model or the instruct model? I would expect the validation score for the instruct version to be lower than for the normal model because of the randomisation in the instruction generation.

Thanks,
Antonio

These are validation scores for the non-instruct dataset. On the instruct dataset, the validation loss is a bit lower, perhaps because (1) given the header, there is considerably less entropy in the story itself, and (2) the headers themselves contain many predictable tokens.
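
To make the two effects concrete, here is a toy sketch in Python. It is not taken from the paper or the training code: the joint distribution, the header strings, and the helper names (`entropy`, `marginal`, `mean_token_loss`) are all made up for illustration. It only shows that conditioning on a header cannot increase the entropy of the story, and that mixing low-loss header tokens into the average pulls the mean per-token loss down.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def marginal(joint, axis):
    """Sum a {(header, story): prob} table down to one variable."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out

# Hypothetical joint distribution over (header, story); the numbers are invented.
joint = {
    ("words: cat", "story_a"): 0.4,
    ("words: cat", "story_b"): 0.1,
    ("words: dog", "story_a"): 0.1,
    ("words: dog", "story_b"): 0.4,
}

p_header = marginal(joint, 0)
p_story = marginal(joint, 1)

# Effect 1: H(story | header) = H(header, story) - H(header) <= H(story).
h_story = entropy(p_story.values())
h_story_given_header = entropy(joint.values()) - entropy(p_header.values())

# Effect 2: averaging over all tokens, predictable header tokens lower the mean.
def mean_token_loss(header_losses, story_losses):
    """Mean per-token loss over header and story tokens together."""
    losses = list(header_losses) + list(story_losses)
    return sum(losses) / len(losses)

print(f"H(story)          = {h_story:.3f} nats")
print(f"H(story | header) = {h_story_given_header:.3f} nats")
print(f"mean loss, story only:    {mean_token_loss([], [2.0, 2.0]):.2f}")
print(f"mean loss, with header:   {mean_token_loss([0.1, 0.1], [2.0, 2.0]):.2f}")
```

Whether the reported validation loss is averaged over header tokens as well as story tokens is an assumption here; if only story tokens were counted, only the first effect would apply.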

AntonioMartini changed discussion status to closed
