Unable to replicate generations from paper

#9
by solver1104 - opened

Hi, I'm testing some of the reasoning prompts from the TinyStories paper, and I'm getting different and less coherent generations from this model than the results presented in the paper. For example, the paper states that the 33M model completes the prompt "On weekends Jack went to visit his grandmother whereas on weekdays he would go to school. Last weekend, when Jack was on his way to" with "his grandmother's house", but on HuggingFace I get "the library". Any idea what might be causing this?

I'm pretty sure the difference comes from the decoding settings, namely the temperature and the number of generation beams (we used temp=0, beams=5).
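For anyone who wants to try those settings locally, here is a minimal sketch using the `transformers` library. It assumes that temp=0 can be approximated by turning sampling off (`do_sample=False`) and that beams=5 maps to `num_beams=5`. The repo IDs `roneneldan/TinyStories-33M` and `EleutherAI/gpt-neo-125M` and the `max_new_tokens` value are my assumptions, not something stated in this thread, so swap in whatever you are actually loading.

```python
def paper_style_generation_kwargs(num_beams=5, max_new_tokens=20):
    """Decoding settings meant to approximate temp=0, beams=5.

    Assumption: temperature 0 is emulated by disabling sampling, which
    makes decoding deterministic. max_new_tokens is an arbitrary choice.
    """
    return {
        "do_sample": False,        # no sampling, so no randomness between runs
        "num_beams": num_beams,    # beam search width
        "max_new_tokens": max_new_tokens,
    }


def complete(prompt,
             model_id="roneneldan/TinyStories-33M",     # assumed repo ID
             tokenizer_id="EleutherAI/gpt-neo-125M"):   # assumed tokenizer
    """Return the prompt plus the model's beam-search completion."""
    # Imported here so the settings helper above works without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        pad_token_id=tokenizer.eos_token_id,  # avoids the missing-pad-token warning
        **paper_style_generation_kwargs(),
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the model on first use):
# print(complete("On weekends Jack went to visit his grandmother whereas on "
#                "weekdays he would go to school. Last weekend, when Jack "
#                "was on his way to"))
```

If the completion still differs from the paper, it is worth checking which checkpoint and tokenizer you are loading before changing the decoding settings.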
