
Checkpoints

#15
by borgr - opened

There are multiple checkpoints mentioned, all inside the OLMo-7B repo. How can one of them have the LR annealed to 0 while a later one in the same repo does not? And what does that mean for the rest of the checkpoints found in the repo?

Allen Institute for AI org

Hi @borgr, for the revisions from step 0 to step 556k we follow a linear LR schedule, and then over the last 1000 steps we anneal the LR to 0. We found this to be better for the performance of the final model.
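Roughly, the schedule looks like this (a minimal sketch; the peak LR, schedule horizon, and warmup handling below are placeholder assumptions, not the exact training config):

```python
def lr_at_step(step, peak_lr=3e-4, schedule_steps=739_000,
               anneal_start=556_000, anneal_steps=1_000):
    """Illustrative only: a linear decay that would not reach 0 on its own,
    with the last `anneal_steps` steps overridden to anneal the LR to 0.
    peak_lr and schedule_steps are placeholders, not OLMo's actual values."""
    # LR the plain linear schedule would give at this step
    linear_lr = peak_lr * max(0.0, 1 - step / schedule_steps)
    if step < anneal_start:
        # revisions up to step 556k follow the linear schedule
        return linear_lr
    # final ~1000 steps: interpolate from the schedule's current LR down to 0
    lr_at_anneal_start = peak_lr * (1 - anneal_start / schedule_steps)
    frac = min(1.0, (step - anneal_start) / anneal_steps)
    return lr_at_anneal_start * (1 - frac)
```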

I think I didn't put the question well

I find the differences between those checkpoints unclear, specifically the ones that are part of allenai/OLMo-7B. How can the non-annealed one be the one with more tokens, batches, and steps?
(screenshot of the repo's checkpoint table)

Allen Institute for AI org

@borgr This might make it clearer:

| Revision | Tokens | LR schedule |
| --- | --- | --- |
| OLMo-7B step452k | 2T | following linear schedule (not annealed) |
| OLMo-7B step556k | 2.460T | still following linear schedule (not annealed) |
| OLMo-7B step557k (main) | 2.464T | LR annealed to 0 |
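Each of these is a separate revision of the same repo, so you can load any of them by passing `revision=` to `from_pretrained`. A minimal sketch; the branch name below is an assumed example and should be checked against the repo's revision list:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "allenai/OLMo-7B"
# Assumed example branch name for the 2.460T-token checkpoint; check the
# repo's list of revisions for the exact string.
revision = "step556000-tokens2460B"

tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    revision=revision,       # omit this to get the annealed "main" checkpoint
    trust_remote_code=True,  # the repo ships custom modeling code
)
```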

Maybe then put something comparable in the NAME and Note columns for the second and third rows?
